Journal of Applied Psychology 


John G. Darley, Editor
University of Minnesota 





Table of Contents 


Predicting Trademark Effectiveness: H. A. Burdick, E. J. Green, and J. W. Lovelace


Studies in Management Training Evaluation: II. The Effects of Exposure in Role Playing: 
C. H. Lawshe, R. A. Bolda, and R. L. Brune 


Follow-up on the Validity of a Forced-Choice Study Activity Questionnaire in Another Setting: 
The Use of IBM Mark-Sense Cards as Multiple-Choice Paper-and-Pencil Test Answer Forms: 
E. Madril


Relationships between Personal and Social Desirability Sets and Performance on the Edwards 
Personal Preference Schedule: A. B. Heilbrun, Jr. and L. D. Goodstein 


Subordinates’ Perceptions of the Productive Engineer: R. E. Stoltz 

Factor Analysis of Reported Minor Personal Mishaps: J. D. Keehn
Job Satisfaction Study of Two Small Unorganized Plants: B. J. Speroff 

Numerical Error Checking: E. T. Klemmer 

Cognitive Similarity and Interpersonal Communication in Industry: H. C. Triandis
A Femininity Adjective Check List: R. A. Berdie 

Prediction of Consumer Purchase and the Utility of Money: L. V. Jones


Categories of Thought of Managers, Clerks, and Workers About Jobs and People in an In- 
dustry: H. C. Triandis 


Relationships between a Top-Middle Management Self-Description Scale and Behavior in a 
Group Situation: L. W. Porter and R. A. Kaufman





American Psychological Association 


Volume 43, Number 5 October, 1959 





Consulting Editors 


Harold E. Burtt, Ohio State University

Alphonse Chapanis, Johns Hopkins University

Clifford E. Jurgensen, Minneapolis Gas Company

Laurence S. McGaughran, University of Houston

Quinn McNemar, Stanford University

A. Mintz, City College of New York

Harold F. Rothe, Fairbanks, Morse and Company

Julian B. Rotter, Ohio State University

Thomas A. Ryan, Cornell University

Donald E. Super, Columbia University

Miles A. Tinker, University of Minnesota

Alfred C. Welch, University of New Mexico





This journal gives primary consideration to origi- 
nal investigations in any field of applied psychol- 
ogy except clinical and consulting psychology, al- 
though a descriptive or theoretical article may be 
accepted if it represents a special contribution in 
an applied field. Quantitative investigations of in- 
terest or value to psychologists working in the fol- 
lowing broad fields will be considered: vocational 
and educational prognosis, diagnosis, and guidance 
at the secondary and college level; personnel re- 
search in business, industry, and government; bio- 
mechanics; industrial working conditions; research 
on opinion and morale factors; job analysis and 
classification research; market and advertising re- 
search. 


Because of the large number of manuscripts sub-
mitted, authors should adhere to the rule of
“brevity consistent with clarity.” The typical
manuscript should run to approximately 4,000 
words. There is a lag of approximately twelve 
months between receipt and publication of an 
article. Authors may request advanced publica- 
tion if they are prepared to pay the cost of print- 
ing the necessary extra pages. 


Manuscripts should be addressed to the Editor, 
John G. Darley, 408 Johnston Hall, University of 
Minnesota, Minneapolis 14, Minnesota. All manu- 
scripts should be submitted in duplicate. Original 
figures are prepared for publication; duplicate fig- 
ures may be photographic or pencil-drawn copies. 


Manuscripts must conform to the style require- 
ments described in the Publication Manual of the 
American Psychological Association. 





Journal of Applied Psychology 


Published bimonthly by the 
American Psychological Association 
Prince and Lemon Sts., Lancaster, Pa. 
and 1333 Sixteenth Street N.W. 
Washington 6, D. C. 


$8.00 per volume 


$1.50 per issue 


Arthur C. Hoffman, Managing Editor; Helen Orr, Promotion Manager; Herbert Newt, Editorial Assistant


Subscriptions, orders, and business communications should be addressed to the American Psychological Association, 
1333 Sixteenth St. N.W., Washington 6, D. C. Address changes must reach the subscription office by the 10th of 
the month to take effect the following month. Undelivered copies resulting from address changes will not be replaced;
subscribers should notify the post office that they will guarantee second-class forwarding postage. Other claims for
undelivered copies must be made within four months of publication.
Second class postage paid at Lancaster, Pennsylvania and at additional mailing places. 
© 1959 by the American Psychological Association, Inc. 





Journal of Applied Psychology 





Vol. 43, No. 5


October, 1959








PREDICTING TRADEMARK EFFECTIVENESS 


HARRY A. BURDICK, EDWARD J. GREEN, and JOSEPH W. LOVELACE 1
Dartmouth College 


Under the sponsorship of Raymond Loewy 
Associates, we undertook the following study. 
The task which confronted us may be related 
in terms of a single question, namely: “How 
effective is a given trademark in comparison 
to six other trademarks, one of each of the 
product’s leading competitors?” The ques- 
tion is a simple and reasonable one, un- 
doubtedly of major significance to all adver- 
tisers. The answer is not so simple, mostly 
because the word “effective” is so difficult to 
define. In the following report, we will pre- 
sent the definition which we used and the 
correlates we found to this definition of effec- 
tiveness. 

We decided that for a trademark to be an 
effective trademark it must have the follow- 
ing properties: 

1. Salience—it must be readily seen and 
recognized. 

2. Meaningfulness—it must convey, through 
associations, connotations which are favorable 
and significant to the observer. 

3. Memory-value—it must be readily re- 
membered by the viewer. 

With these criteria, we set out to investigate 
the relative effectiveness of each of seven 
trademarks. 

Method 


Subjects 
In all, 166 male Ss were tested. A proportion of
these Ss (40) participated in the salience and meaning
phase of this experiment. The other 126 participated
in the meaning and memory-value phase of the
experiment. All Ss were paid for their participation. 
The 40 Ss who performed in the salience aspect had 
vision corrected to 20/20. 


Materials 
Official trademarks of seven companies were ob- 
tained. These were enlarged photostatically and the 
official colors applied by air brush. Ektachrome 
transparencies were prepared of each of the emblems. 
1 J. W. Lovelace is with the Raymond Loewy As-
sociates.




In the memory-value aspect of the study, all of the 
trademarks were presented on one slide. Four dif- 
ferent arrangements of the trademarks were photo- 
graphed for this, so that whatever effects position 
might have would be reduced. Ektachrome trans- 
parencies of the composites were produced. 


Salience and Meaning Procedure 

Ss appeared in a testing laboratory individually. 
They were met by E, seated before a screen, and 
told that words would be flashed upon the screen. 
S was to report everything he saw. Slides were 
shuffled before each series to control the effect of 
position in the series. For each S, the point at
which correct identification occurred was recorded 
by E. 

The slides were shown through a matched pair of 
300-v. projecting tachistoscopes which were placed 
64 ft. from the screen. The screen was held at a constant
illumination to reduce the reading of afterimages.
Preliminary investigation showed that no
one recognized the trademarks at exposure intervals 
less than 1/50 sec. without repeated presentations. 
Therefore, our series began here, with all slides being 
presented at 1/50 sec., then all slides at 1/25 sec., all 
at 1/10, 1/5, 1/2, and 1 sec. After this part of the 
study was completed, S was given a ten-item copy 
of the Semantic Differential described by Osgood, 
Suci, and Tannenbaum (1957). He was shown each
slide for as long as he wished and evaluated the 
trademark in terms of 10 seven-point dimensions. 
This is our operational definition of “meaning.” 


Memory-Value and Meaning Procedure 


Groups met and were told they would be asked 
to view a number of slides. Half of the group was 
sent out of the room while the others remained. 
The ones remaining were provided with forms for 
the Semantic Differential, and asked to turn the 
forms over and write on the back. The technique 
used to test memory-value was the Aussage tech- 
nique. 

One of the four transparencies containing a view 
of all of the trademarks was presented to the group 
for 15 sec. Then Ss were asked to write down all 
of the trademarks which they recalled. The pro-
cedure was duplicated for the other half of the
group, using a second slide with different positional 
arrangement. After responses were made to the com- 
posites, each trademark was presented alone, and Ss 
in the group were asked to evaluate it on each of 
10 dimensions of the Semantic Differential. 







Results 

Salience 

Mean recognition thresholds were computed 
for each of the trademarks. Trademarks 
were then ranked, with the trademark having 
the lowest threshold receiving a rank of one. 
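
As a rough illustration of the salience ranking just described, the following Python sketch averages per-subject thresholds and ranks the trademarks. The trademark labels and threshold values are hypothetical, and the function name `rank_by_mean_threshold` is introduced here for illustration only. The report does not say how ties or never-recognized slides were treated, so the sketch assumes neither occurs.

```python
from statistics import mean


def rank_by_mean_threshold(thresholds):
    """Rank trademarks so the lowest mean recognition threshold gets rank 1.

    `thresholds` maps a trademark label to the exposure durations (in seconds)
    at which each subject first identified that trademark correctly.
    """
    means = {name: mean(values) for name, values in thresholds.items()}
    ordered = sorted(means, key=means.get)  # shortest mean exposure first
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical thresholds for three trademarks and three subjects (not study data).
example_thresholds = {
    "A": [1 / 50, 1 / 25, 1 / 25],
    "B": [1 / 10, 1 / 5, 1 / 10],
    "C": [1 / 25, 1 / 25, 1 / 10],
}
salience_ranks = rank_by_mean_threshold(example_thresholds)
print(salience_ranks)
```
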
Meaning 

Following a recommendation of Osgood 
et al. (1957), the characteristics of evalua- 
tion, potency, and activity were taken to be 
most significant in designating “meaning.” 
Good-bad, strong-weak, and active-passive 
were taken to be the focal dimensions of 
evaluation, potency, and activity. Scores on 
these dimensions were summed for each trade- 
mark, and a rank ordering from most to least 
along the dimension was obtained. The rank 
on the evaluation dimension was doubled, as 
Osgood et al. suggest, and added to the rank 
of the other two dimensions. A final rank 
order of the trademarks for the three dimen- 
sions combined was thus obtained. 
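
The weighting described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the summed scale scores are hypothetical, the helper names are invented for this example, and ties are assumed not to occur, since the report does not describe how they were resolved.

```python
def rank_descending(scores):
    """Rank labels from the highest summed score (rank 1) to the lowest."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def combined_meaning_rank(evaluation, potency, activity):
    """Double the evaluation rank, add the potency and activity ranks, then re-rank."""
    ev = rank_descending(evaluation)
    po = rank_descending(potency)
    ac = rank_descending(activity)
    weighted = {name: 2 * ev[name] + po[name] + ac[name] for name in evaluation}
    # A smaller weighted rank sum means a more favorable overall standing.
    ordered = sorted(weighted, key=weighted.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical summed scale scores for three trademarks (not the study's data).
evaluation = {"A": 210, "B": 250, "C": 190}
potency = {"A": 230, "B": 200, "C": 220}
activity = {"A": 205, "B": 215, "C": 225}
meaning_ranks = combined_meaning_rank(evaluation, potency, activity)
print(meaning_ranks)
```

In this made-up case the weighted rank sums are 8, 7, and 9 for A, B, and C, so B takes the first combined rank because the evaluation dimension counts twice.
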


Memory-Value 

A score of seven was assigned to a trade- 
mark every time it was the first of the group 
which was written down by S, a six for each 
time it was second, and so on. If it did not 
appear at all, it was given a score of zero. 
From the totals of the summed scores, the 
trademarks were ranked in terms of the de- 
gree to which they stood out and were re- 
membered by S. The trademark having the 
highest score was rank one, the second two, 
and so on. 
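
The scoring rule above can be expressed as a short Python sketch. The recall lists and single-letter trademark labels are hypothetical, the function name is an assumption made for illustration, and the sketch assumes each S names a trademark at most once.

```python
def memory_value_scores(recall_lists, trademarks):
    """Total the order-of-recall scores and rank the trademarks.

    With seven trademarks, the first one written down earns 7, the second 6,
    and so on; a trademark that is not recalled earns 0 for that subject.
    """
    top_score = len(trademarks)
    totals = {name: 0 for name in trademarks}
    for recalled in recall_lists:
        for position, name in enumerate(recalled):
            totals[name] += top_score - position
    # Highest total receives rank 1; tied totals keep their listed order here.
    ordered = sorted(trademarks, key=lambda name: totals[name], reverse=True)
    ranks = {name: position for position, name in enumerate(ordered, start=1)}
    return totals, ranks


# Hypothetical recall protocols from three subjects (not the study's data).
trademarks = ["A", "B", "C", "D", "E", "F", "G"]
recall_lists = [["C", "A", "F"], ["A", "C"], ["C", "B", "A", "G"]]
totals, memory_ranks = memory_value_scores(recall_lists, trademarks)
print(totals)
print(memory_ranks)
```
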

Using a Spearman rho, rank order correla- 
tions were computed for each of the aspects 
of effectiveness with one another. On the 
basis of face validity, the memory-value as- 
pect of our effectiveness definition was felt to 
be most critical. Hence the correlations of 
salience and meaning with memory-value were 
of special importance. Finally, a composite 
rank order was composed by combining the 
ranks of the salience and meaning dimensions, 
giving them equal weight, and this was cor- 
related with the memory-value order. The 
matrix of correlations is presented in Table 1. 
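
For readers who wish to retrace the computation, a minimal Python sketch of the Spearman rank-order correlation follows. It uses the standard formula for untied ranks, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)). The two rank orders shown are hypothetical and are not the orders obtained in this study.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman rank-order correlation for two untied rank orders of equal length."""
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical rank orders for seven trademarks (not the study's data).
composite_rank = [1, 2, 3, 4, 5, 6, 7]
memory_value_rank = [1, 3, 2, 4, 5, 7, 6]
rho = spearman_rho(composite_rank, memory_value_rank)
print(round(rho, 3))
```
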

As can be seen in Table 1, all of the as- 
pects correlate positively with one another. 
Of particular interest is the exceptionally high 
correlation between memory-value and the 
composite ordering of the salience and mean- 
ing aspects. 


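Table 1

Intercorrelation of Effectiveness Measures

Rows: Visual Threshold; Semantic Differential; V.T. and S.D. Combined.
Columns: one heading garbled in the source (“Wit.”); Aussage.
Legible entry: V.T. and S.D. Combined with Aussage, .94**.
The remaining entries are garbled in the source (“af Bg”, “2*”, “8”).

* Significant at .05 level.
** Significant at .01 level.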


Discussion 

We feel that the results of this study indi- 
cate a great deal of power is to be obtained 
from a straightforward approach to a par- 
ticularly practical but slippery problem. The 
approach has involved techniques taken from 
two subdisciplines of psychology, namely, the 
sensory and cognitive areas. Through an op- 
portune choice of variables in these areas, we 
have demonstrated the significance of the 
variables for either an evaluation of an exist- 
ing emblem or of an original, new trademark. 
This obviously does not mean that the most 
effective trademark, by this definition, will 
necessarily give rise to an increase in sales or 
connote integrity or honesty of the firm sym- 
bolized. But it is even more apparent that 
such desired ends are not to be fostered by 
making the trademark less visible or less fa- 
vorable in its meaning. The first step to- 
ward “effectiveness” in this latter sense is 
the making of the trademark as effective as 
possible in salience, meaning, and memory- 
value. 

Summary 

We have investigated the effectiveness of 
seven competing trademarks. Effectiveness is 
defined in terms of the salience, meaning, and 
memory-value of the trademark. The three 
characteristics were found to be positively re- 
lated. Further, taking memory-value to be 
our major dependent variable, we have found 
that, through a combination of our salience
and meaning measures, we were able to pre- 
dict the memory-value with a high degree of 
success. 


Received December 29, 1958. 


Reference 


Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. 
The measurement of meaning. Urbana: Univer. 
Illinois Press, 1957. 





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


STUDIES IN MANAGEMENT TRAINING EVALUATION: 
II. THE EFFECTS OF EXPOSURES TO ROLE PLAYING 1


C. H. LAWSHE,2 ROBERT A. BOLDA,3 and R. L. BRUNE 4


Occupational Research Center, Purdue University 


Human relations training in management is 
characterized by variability with respect to 
training content and training technique. To 
the extent that the content is appropriate to 
the accomplishment of the training objective, 
and the training technique is capable of fa- 
cilitating learning, the training presentation 
can be considered effective. The empirical 
evaluation of both training content and train- 
ing devices is required. In particular, it is 
felt that research in the now-popular partici- 
pative training techniques is necessary; such 
investigations would indicate the particular 
virtues of each approach and the specific 
types of applications which result in optimal 
effectiveness. 

A series of studies has been undertaken to 
evaluate role playing as a tool in manage-
ment human relations training. While there
are several variations of role playing, experi-
mental interest has been centered on the most
popular industrial method, skit completion.
In this approach, trainees are presented with
a case involving the development of a prob-
lem situation and are required to enact a
completion of the scene spontaneously. For
example, one trainee may be asked to play 
the part of the foreman, one the employee, 
and, in some instances, the remaining trainees 
are required either to observe the action pas- 
sively or to identify with one of the role 
players. It is the purpose of this article to 
summarize five studies which have been con- 
ducted to evaluate the effects of single and re- 
peated role playing experiences. Three groups 
received one session, and two groups partici- 
pated in four weekly role playing sessions. 

1 This research was carried out under a grant from 
the Foundation for Research on Human Behavior, 
administered through the Purdue Research Founda- 
tion. 

2 Now with University Extension, Purdue Univer-
sity.

3 Now with Chevrolet Engineering Center, General
Motors Corporation.

4 Now with the Lockheed Aircraft Corporation,
Missiles Division.




Evaluation criteria. A “work sample” hu- 
man relations training case was administered 
to trainees in pre- and posttraining situations 
in order to obtain indices of human relations 
performance levels. An incomplete-type case, 
the Case of the Reddened Eyes,5 was selected
for this purpose after considerable preliminary
research (Lawshe, Bolda, & Brune, 1958).
This case presents the development of a fore- 
man-employee problem situation, and con- 
cludes at the point where some form of su- 
pervisory action is required. After viewing 
the sound-slide film, trainees are asked to 
give open-end responses to the following ques- 
tions: 


1. If you were the foreman, what would 
you do now? 

2. Why did the employee behave the way 
she did? 


Responses to the first question were scaled
on a continuum of Employee-orientation, in- 
dicating the extent to which the response 
tended to cope with the human problem pre- 
sented in the case. Responses to the second 
question were scaled on a continuum of Sensi- 
tivity, reflecting the extent to which the re- 
sponse indicated a perceptiveness of the so- 
cial cues presented in the film. Scaling pro- 
cedures are described in an earlier article in 
this series (Lawshe et al., 1958). It is pos- 
sible to evaluate the effects of the training 
experiences by comparing pre- and posttrain- 
ing response scale scores. 


Procedure 


Study 1. A campus conference group of residential 
contractor foremen participated in this study.6 At
the outset of the period, the Case of the Reddened 
Eyes was shown and responses to the two criterion 


5 Description given in McGraw-Hill News Bulletin
L-23129 TF, announcing the filmstrip series: Super-
visory Problem in the Plant.

6 The authors wish to acknowledge the coopera-
tion of George E. Davis and Merle McClure, Divi-
sion of Adult Education, Purdue University.







questions were obtained. Forty of the foremen were 
randomly divided into two role-playing treatment 
groups. The first 20 were randomly assigned to five- 
man role playing subgroups. Within each sub- 
group one S was assigned to each of the following 
role positions: foreman role player, worker role 
player, foreman identifier, worker identifier and ob- 
server. The role players were instructed to enact a 
completion of the criterion case. A graduate student 
leader in each subgroup got the sessions started and 
led a post-role-playing discussion period. Role play-
ing sessions lasted from 10 to 20 minutes and dis-
cussion periods ranged between 15 and 25 minutes.
(The leaders were instructed to minimize group de-
cision with regard to the “real reasons for the em-
ployee’s behavior.”) No attempt was made to pre-
sent a single best action alternative to the case.

The twenty foremen in the second treatment group 
were similarly divided into five-man subgroups.
These participants were assembled in a separate
room and shown the sound slide film Case of the 
Reluctant Electrician. Role position assignments 
were given as described above. The men role played 
and discussed this second case only. Discussion char- 
acteristics and time allotments were identical with 
those mentioned above. 

At the conclusion of the discussions, the trainees 
were reassembled in a group and the Case of the 
Reddened Eyes was readministered. Responses to
the criterion questions were obtained again. All re-
sponses were scaled by the abbreviated master scal-
ing procedure previously described. The reliabilities
of the average scale scores assigned by four expert
judges were .95 and .93 for the Employee-orienta-
tion and Sensitivity scales, respectively. These fig-
ures are average rater intercorrelations stepped up
by a factor of four.
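
The phrase "stepped up by a factor of four" presumably refers to the Spearman-Brown formula applied to the average inter-rater correlation; that reading is an assumption, since the text does not name the formula. Under that assumption, the Python sketch below shows the step-up and its algebraic inverse. The function names are illustrative, and the back-calculated single-rater values are derived figures, not values reported by the authors.

```python
def spearman_brown(mean_r, k):
    """Reliability of the average of k raters, given the mean inter-rater correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def implied_mean_r(stepped_up, k):
    """Invert the Spearman-Brown formula to recover the mean inter-rater correlation."""
    return stepped_up / (k - (k - 1) * stepped_up)


# Back-calculation from the reported four-judge reliabilities (illustrative only).
for reported in (0.95, 0.93):
    single = implied_mean_r(reported, 4)
    print(reported, round(single, 3), round(spearman_brown(single, 4), 3))
```
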

Study 2. Participants in the second study were 
taken from a group of school custodians meeting in 
a campus conference session. Twenty-six Ss were 
randomly selected for role playing. One-half of the 
men were taken aside and given the foreman role 
instructions for the Case of the Stubborn Employee 
(Maier, 1952); the other 13 were instructed in the 
worker’s role. In this case, the foremen role players
are asked to present an order-giving demonstration 
by instructing the employee to install storm windows 
on the first and second floor of a building. The 
worker role players are instructed that they are 
“afraid of high places,” a fact which the foreman 
does not know beforehand. Thirteen pairs of role 
players were formed and role playing was allowed to 
continue until all pairs had completed the skit (about 
25 minutes). The leader then informed the “fore- 
men” of the height factor. 

The Case of the Reddened Eyes was administered 
to the participants both before the role playing and 
immediately after. Responses to the Sensitivity and 
Employee-orientation questions were obtained after 
each administration, and were identified according to 
the role position of the respondent. 
The open-end responses were scaled by the abbreviated
master scaling procedure. The reliabilities of the average
scores assigned by four expert judges were .91 and
.93 for the Sensitivity and Employee-orientation re-
sponses, respectively.

Study 3. The participants were 27 staff super- 
visors in an educational institution participating in 
a 13-week management training program. The ex- 
perimenters made two-hour presentations to this 
group during two successive weeks. On the first 
week, the participants were randomly divided into 
four- and five-man subgroups to role play the Case 
of the Stubborn Employee. Role position assign- 
ments were made in the manner described in Study 1. 
Only the worker role players were informed about 
the fear of heights. Role playing was allowed to 
continue until all subgroups had completed the skits.
At that time, the leader revealed the height problem 
and encouraged group discussion of the case. He 
attempted to emphasize the importance of “finding 
out why.” 

During the second presentation a week later, a 
tape recording of a role playing session on the same
case was presented, and the leader attempted to inte- 
grate this action with lecture material on social per- 
ception. The consequences of incomplete informa- 
tion on supervisory actions were emphasized. At the 
end of the second period, the Case of the Reddened 
Eyes was shown and responses to the two criterion 
questions were obtained. It was possible to identify 
respondents according to their role positions during 
the first week. No premeasure was taken because 
time requirements did not permit. 

The open-end responses were scaled by the ab- 
breviated master scaling procedure. The reliabilities 
of the average scores assigned by four judges were 
.93 and .96 for the Sensitivity and Employee-orienta- 
tion scales, respectively. 

Study 4. The participants were 16 male line and 
staff supervisors participating in a company manage- 
ment development program. At the start of the 
first period, they were shown the filmstrip, The Case 
of the Reddened Eyes, and were asked to write their 
responses to the criterion questions. 

The open-end responses were subsequently scaled 
on the Employee-orientation and Sensitivity continua.
The supervisors were divided randomly into four-
man groups and one pair of trainees in each group
was asked to assume the foreman and worker roles 
in a role playing skit on the criterion case. Other 
group members acted as observers. After the role 
playing demonstrations were completed, the experi- 
menter led a discussion of reasons for the employee’s 
behavior, and similarities between the case and their 
own experiences. No attempt was made to lead the 
trainees to group decision with respect to either (a) 
the “real” reasons for the employee’s resistance, or 
(b) the “best” way to handle the case. It should be 
pointed out that the trainees had been meeting as a 
group for several months prior to the experiment 
and had participated in many discussion-type train- 
ing experiences. For this reason, it was not possible 
to steer the group away from general decision with 
respect to the “Why” factor, although every attempt 
was made to minimize this. 







During the three subsequent weeks, the trainees 
were presented with other training cases for role 
playing and discussion. In every case, the post-role- 
playing discussion periods were handled in the man- 
ner outlined above. On the fifth week, the Case of 
the Reddened Eyes was readministered, and responses 
to the two stimulus questions were obtained. In ad- 
dition, the participants completed a role playing 
evaluation questionnaire (Lawshe, Brune, & Bolda, 
1958). After the final session, the open-end re- 
sponses from the first and fifth weeks were pooled 
and scaled by a forced-sort scaling procedure. 

Study 5. The Ss were 29 supervisors from a va- 
riety of industries enrolled in an institutional five- 
week Human Relations Training course. Fourteen 
persons attended a morning session; 15 were enrolled 
in an afternoon session. The same instructor pre- 
sented the same material in both two-hour sessions. 
The morning participants became the experimental 
group and were available to the experimenters for 
the first hour of each week. The afternoon trainees 
were designated the control group. 

At the first session, both groups were shown the 
criterion film strip and wrote their responses to the 
Sensitivity and Employee-orientation questions. In 
addition, they were asked to indicate why they de- 
cided upon a particular course of action by marking 
one item in the following check list:

(a) Convince the girls that the lights were O.K.
(b) Patch up Joan’s hurt feelings.
(c) Show the girls that you are still the boss.
(d) Assure Joan that she is important.
(e) Find out if the lights are harmful to the eyes.
(f) Make Joan an example to the other girls.


The check list forms the basis of an “Intentions”
classification. The experimenters then presented the
Case of the Stubborn Employee (Maier, 1952) to
the experimental group for role playing.

During the three subsequent training sessions, cases 
were presented for role playing which illustrated both 
the “Why” aspects of human relations, and also the 
“What to do” factors. Discussion periods after role 
playing sessions were directed so as to emphasize 
both the importance of finding out “Why” and the 
importance of selecting an appropriate course of su- 
pervisory action. 

The control group simply discussed the technical 
aspects of the Case of the Reddened Eyes for sev- 
eral minutes; the experimenters then left the room 
and did not reappear until the fifth week. 

At the fifth session, both groups were again shown 
the Case of the Reddened Eyes. Sensitivity, Em- 
ployee-orientation and Intentions responses were ob- 
tained. In addition, experimental group members 
completed a 17-item role playing questionnaire.

Sensitivity and Employee-orientation responses were
scored by four judges using the abbreviated master 
scaling method. The reliabilities of the average 
scores were .96 and .95, respectively. Responses to 
the Intentions check list were treated as classifica- 
tion frequencies. 


Results 

Single exposure studies. Analyses of vari- 
ance were applied to the Sensitivity and Em- 
ployee-orientation scale scores obtained in 
Study 1. None of the main effects or inter-
actions was found to be significant. The pre-
and postposition and treatment means are 
shown in the top section of Table 1. 

Table 1

Pre- and Postresponse Mean Scores in Single Exposure Studies

Columns: Study; Role Position; Sensitivity Scale (Pre, Post); Employee-orientation Scale (Pre, Post).
Rows: Study 1 (Foremen, Workers, Foremen identifiers, Worker identifiers, Observers); Study 2 (Foremen, Workers); Study 3 (Foremen, Workers, Foremen identifiers, Worker identifiers, Observers).
Values legible in the source, in source order and without recoverable cell positions: 43.0, 48.4, 38.8, 31.3, 35.4; 52.9, 44.7, 46.2, 39.9, 50.7, 36.0, 28.2; 32.4, 46.0; and a truncated entry beginning 66.


Table 2

Pre- and Postresponse Mean Scores in Repeated Exposure Studies

Columns: Study; Role Position, 1st Week; Sensitivity Scale (Wk. 1, Wk. 5); Employee-orientation Scale (Wk. 1, Wk. 5).
Rows: Study 4 (Foreman, Worker, Observer); Study 5 (All Control Members, All Exptl. Members, Foreman, Workers).
Values legible in the source, in source order and without recoverable cell positions: 63.1, 60.8; 69.1, 61.9; 57.6; 46.6; 44.4; 52.39; 40.9; 53.9, 54.3; 48.2, 56.3; 57.8, 49.7; 54.9.


Simple t tests were applied to the differ-
ences between pre- and postmeasures on fore-
man and worker role players in Study 2. The
Employee-orientation increase for the fore-
man role players (from 32.4 to 37.8) achieved
significance (t = 2.00). Foreman role play-
ers’ scores on Sensitivity also increased (NS),
while worker role players’ scores were not sig-
nificantly different on either measure. Re-
sponse means for Study 2 are shown in the
center section of Table 1.

One-way classification analyses of variance
were applied to the data obtained in Study 3,
treating role position as the variable of classi-
fication. Neither Sensitivity nor Employee-
orientation mean scores differed significantly
by role position on the basis of the over-all
F test. The response means are shown in the
lower section of Table 1. Since the experi-
menters had decided a priori to test for dif-
ferences between foreman and worker role
player means, simple t tests were applied to
these data on both measures. While foreman
role players tended to give better responses
on both scales, only the difference on the
Sensitivity measure was significant (t = 2.18).

Repeated exposure studies. The over-all
pre- and posttraining Sensitivity mean scores
(first and fifth week responses) in Study 4
were 47.0 and 63.0, respectively. The dif-
ference between these means is significant
(t = 2.44). That the shift was characteristic
of all Ss, rather than being differentiated ac-
cording to the S’s role position on the first
week, is disclosed in Table 2.

The Employee-orientation means for all
persons were 58.0 on the first week and 50.4
for the fifth. This difference is not signifi-
cant. A comparison between means of per-
sons who had assumed various of the role po-
sitions on the first week did not demonstrate
differential effects on this variable. The re-
sponse means are shown in Table 2.

Unweighted means analyses of variance were 
applied to the Sensitivity and Employee-ori- 
entation scores obtained in Study 5 in the 
control and experimental groups. These analy- 
ses revealed that the groups were homogene- 
ous on the first week with respect to scores 
on both criterion variables. 

However, an examination of the Group ×
Weeks interaction revealed that the experi- 
mental group improved significantly on the 
Sensitivity dimension while the control group 
did not. A similar analysis applied to the 
Employee-orientation data failed to disclose 
significant improvements for either group. 
The response means in Table 2 show that 
members of the experimental group improved 
somewhat on this variable while control mem- 
bers maintained their initial score levels. 

The experimental group members had been 
either worker or foreman role players on the 
Case of the Stubborn Employee on Week 1. 
The means of these subgroups appear in the 
lower part of Table 2. The foreman role 
players significantly improved their Sensitiv- 
ity performance (p < .01). No other mean 
differences achieved significance. 

Regarding Intentions, the check list items
(b) and (d) represent “good” responses. The 
remaining items are designated as “other.” 
The frequencies in Table 3 are based on the 
persons who initially indicated “other” re-
sponses. The table shows the number of per-
sons within the groups who repeated “other”
responses or changed to “good” responses.
The Fisher exact probability test (Siegel,
1956) yielded a p = .048 (one-tailed). The
more favorable shift of Intention response oc-
curred among the experimental Ss.


Table 3

Intention Changes between Weeks 1 and 5 (Study 5)

Columns: Group; No. of Persons Shifting to “Good” Category; No. of Persons Staying in “Other” Category.
Rows: Experimental; Control.
Values legible in the source, without recoverable cell positions: 4 (Experimental row), 7 (Control row).
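
A one-tailed Fisher exact probability of the kind cited above can be sketched in Python from the hypergeometric distribution. The cell counts below are hypothetical and chosen only to show the arithmetic; they are not the frequencies of Table 3 and do not reproduce the reported p = .048. The function name is likewise an assumption made for this illustration.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the 2 x 2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of a top-left cell
    count at least as large as the observed value a.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    largest = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, largest + 1)
    )
    return favorable / comb(n, col1)


# Hypothetical counts: rows are groups, columns are "shifted" and "stayed".
p_value = fisher_one_tailed(3, 1, 1, 3)
print(round(p_value, 4))
```
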


Discussion and Conclusions 


The pattern of results obtained in these five
studies suggests the tenability of postulating
impact as a factor in effecting change in human
relations dimensions. Impact is here defined
as a characteristic of a training experience
which (a) allows the trainee to criticize
his own performance in human relations tasks,
(b) provides an adequate type of feedback to
the trainee regarding his performances, and
(c) serves to emphasize a particular human
relations factor in a strong, emotional manner.
The authors interpret these results as
indicating the absence of impact in Study 1.
In that study, participants were allowed to 
role play and discuss the cases without direct 
indication of the “real adequacy” of their 
performances and perceptions. As a result, 
no response changes on the criterion case were 
noted. 

The impact factor, however, was present in 
Studies 2, 3, 4, and 5. The first two groups 
role played only the Case of the Stubborn 
Employee. It has been the authors’ experience
that only about 5 to 10% of the foreman
role players discover the height problem
in this case. Consequently, many foremen 
players resolve the problem situation by dis- 
charging, transferring or otherwise taking ag- 
gressive action against the “stubborn” em- 
ployee. The revelation that there is a real 
causative factor behind the reluctance con- 
tributes to an impact experience. The effects 
of this experience seem to have generalized to 
improved performance on a second training
case, the Case of the Reddened Eyes. It
should be noted that impact, as evidenced 
by significant score improvement, occurred in 
Study 2, Study 3, and Study 5 for only the 
foreman role players, in two instances on the 
Sensitivity dimension and in the third on Em- 
ployee-orientation. 

The effects of repeated exposures to role 
playing evidenced in both Study 4 and Study 
5 are reflected in increased Sensitivity scores. 
In Study 4 it was found that a standard train- 
ing program did not effect these changes. It 
was proposed that only the participant who 
is placed in a problem-solving situation and 
is made to realize the inadequacy of his initial 
response is likely to benefit from the experi- 
ence. The foreman role players in the Case 
of the Stubborn Employee (Study 5) had this 
opportunity. The fact that Sensitivity im- 
provements in Study 4 were not differentiated 
according to role positions on the first week 
indicates that the impact experience was a 
general one for all participants, and that the 
impact may very well have occurred as a re- 
sult of the type of discussion which followed 
role playing. 

Neither of the repeated exposures pre-post 
Employee-orientation comparisons resulted in 
rejection of the null hypothesis. Although 
the trend in Study 5 is in the direction pre- 
dicted, the “error” variability is sufficiently 
large to negate the mean score improvement. 
The authors suggest that the observed im- 
provement, however, is a reflection of (a) the 
types of cases presented, and (b) the types of
post-role-playing discussions held. 

The lack of Employee-orientation change in 
Study 4 is not easily explained. Although the 
experimenters attempted to lead a “Why”-oriented
discussion, the participants succeeded
in bringing the “What to do” element into 
the picture. This discussion is not reflected 
in scale score shifts. Of all the possible ex- 
planations for this phenomenon, the authors 
prefer the following: The trainees were all 
members of one management organization. 
To the extent that organizational patterns of 
“accepted” behavior are present, training can 
be expected to have little effect on action re- 
sponses. If these managers have a common 
attitude toward employee problem situations, 
and a homogeneous method for classifying
these problems, abbreviated training would be
relatively ineffective in altering these factors. 

In Study 5, there is some evidence that 
Employee-orientation scores and Intentions 
are differentially affected. 

The results cited here suggest the follow- 
ing conclusions with regard to exposures to 
role playing: 


1. Experiences with the skit-completion 
method of role playing are effective in pro- 
ducing Sensitivity and Employee-orientation 
changes to the extent that impact occurs in 
the training. 

2. The beneficial effects of such an experi- 
ence are capable of transferring to a second 
or novel human relations problem situation. 

3. Repeated exposures to role playing, as
administered in these studies, contribute little
to criterion response improvements.

In both repeated exposure studies, signifi- 
cant Sensitivity shifts occurred where impact 
was involved on the first week. There is 
some indication that case selection and type 
of discussion can increase Employee-orienta- 
tion. 

4. Impact experiences may occur either as
a result of the type of case used, or as a result
of the type of discussion held after role
playing.

5. Improved Sensitivity responses are not 
necessarily accompanied by better Employee- 
orientation responses. 


If it is accepted that an objective of human 
relations training is to increase the likelihood 
of adopting courses of action appropriate to 
human problem situations, the authors sug- 
gest that the “What to do” aspect must be 
emphasized throughout the training. The 
phenomena observed in these studies may be 
due to the relative difficulty of training Em- 
ployee-orientation. 




Summary 


Five studies were conducted to evaluate the 
effects of single and repeated exposures to 
the skit-completion method of role playing. 
Evaluation criteria consisted of scaled re- 
sponses to a standard human relations train- 
ing case in two dimensions: Sensitivity and 
Employee-orientation. Criterion responses 
were obtained before and after role playing 
in four subject groups, and after the training 
in a fifth group. Various role playing treat- 
ment conditions and role assignments were 
investigated. It was found that changes in 
criterion case responses were effected in only 
those instances where impact occurred in con- 
nection with the training experience. Im- 
pact is effected by (a) case materials and 
(b) the type of discussion held after role
playing. In addition, in those cases where 
the impact factor was evident, the effects of 
this experience were capable of generalizing 
to performances on a second training case. 

Within the comparative limitations imposed 
by the present research procedures, repeated 
exposures to role playing showed little advan- 
tage over the single, impact experience. It 
was also found that Sensitivity and Employee- 
orientation improvements are differentially ef- 
fected. 

Received October 20, 1958. 
References

Lawshe, C. H., Bolda, R. A., & Brune, R. L.
Studies in management training evaluation:
I. Scaling responses to human relations training cases.
J. appl. Psychol., 1958, 42, 396-398.

Lawshe, C. H., Brune, R. L., & Bolda, R. A.
What supervisors say about role playing. J. Amer.
Soc. Train. Directors, 1958, 12(8), 3-7.

Maier, N. R. F. Principles of human relations. New
York: Wiley, 1952.

Siegel, S. Nonparametric statistics. New York:
McGraw-Hill, 1956.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


FOLLOW-UP ON THE VALIDITY OF A FORCED-CHOICE
STUDY ACTIVITY QUESTIONNAIRE IN ANOTHER 
SETTING 


HOWARD MAHER 


University of Pennsylvania 


Schutter and Maher (1956) have previ- 
ously reported in this Journal the validation 
and cross-validation of a forced-choice study 
activity questionnaire. It was of interest to 
the present author to see if the instrument, 
validated and cross-validated on a state col- 
lege population, would carry over its validity 
to another, private university. Also, in the 
construction of the questionnaire, as described 
in the original article, the instrument was de- 
signed to have no correlation with scholastic 
aptitude test scores, thus to add more greatly 
to the multiple prediction of grade-point av- 
erage. It is of importance to see if this rela- 
tionship will hold using another test of scho- 
lastic aptitude and with another sample. 

Moreover, the population on which the test 
was originally standardized was composed
only of freshmen and sophomores. The pres- 
ent sample is composed of all classes save 
freshmen, and one may be interested in seeing 
whether the original finding of no differences 
in class scores holds for this sample. If so, 
separate norms would not be required for use 
with the different undergraduate classes. 


Procedure 


The Ss used were 189 sophomore, junior, and 
senior students of finance and commerce taking an 
introductory psychology course. Each student was 
asked to fill out the 30 block forced-choice scale 
(Schutter & Maher, 1956), using IBM answer sheets. 
These were machine scored, using the weights of 
minus three to plus three as established on the origi- 
nal Iowa State College (IOWA) validation groups. 

For the same Ss there were also available College 
Entrance Examination Board, Scholastic Aptitude 
Test (SAT) scores and accumulated grade-point av- 
erages (GPA). Since the ACE-L score was used in 
the I. S. C. investigation, it was decided to make for 
comparability of the scholastic aptitude measures by 
using the SAT verbal scores (V). 

To answer the validity and intercorrelation ques- 
tions (above), the Pearson product-moment coeffi- 
cients were computed among scores on the study ac- 
tivity questionnaire (SAQ), SAT-V, and GPA. Also, 
the zero-order coefficients were combined to obtain
the multiple prediction of GPA. Finally, t tests
were run among the mean scores of the classes to 
answer the third question above. 


Results 


Parres (1955), in an investigation of the 
SAT at this university, reported all finance 
and commerce students to have mean SAT-V 
scores of 496.7 (SD = 84.19). For the pres- 
ent sample, the mean is 499.7 (SD = 80.06). 
The difference is not statistically significant 
(t = .43), and the chosen sample would appear
comparable to the pertinent population.

Table 1 shows the validities of SAQ and 
SAT-V and the intercorrelation between SAQ 
and SAT-V. (All are Pearson product-mo- 
ment coefficients.) Both SAQ and SAT-V 
contribute significant prediction of GPA (an 
r of .21 is significant at the .01 level). 
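
The significance statement for r can be checked with the usual t transformation of a correlation coefficient. The sketch below is illustrative only: the function name is hypothetical, N = 189 is the sample size given under Procedure, and the critical value of about 2.60 (two-tailed, .01 level, 187 df) is an approximate tabled value supplied here as an assumption, not a figure from the article.

```python
import math

def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

N = 189            # sample size reported under Procedure
T_CRIT_01 = 2.60   # ASSUMED approximate two-tailed .01 critical t for 187 df

for label, r in [("r = .21", 0.21), ("SAQ vs. GPA", 0.48), ("SAQ vs. SAT-V", 0.03)]:
    t = t_for_r(r, N)
    print(f"{label}: t = {t:.2f}, significant at .01: {abs(t) > T_CRIT_01}")
```

Under these assumptions an r of .21 clears the .01 criterion, while the SAQ vs. SAT-V intercorrelation of .03 falls far short of it.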

When the two predictor zero-order correlations
are combined, $R_{1.23} = .62$, where:


1=GPA 
2 = SAQ 
3 = SAT-V 
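
For readers who want to see how two zero-order validities and one intercorrelation combine, the standard two-predictor formula can be sketched as follows. This is a minimal illustration, not the author's computation: the function name is hypothetical, and the SAT-V validity of .41 is an assumed value chosen for illustration, since that coefficient is not legible in this copy of Table 1. The values .48 and .03 are the ones quoted in the text.

```python
import math

def multiple_r(r12, r13, r23):
    """Multiple correlation of variable 1 with predictors 2 and 3."""
    r_squared = (r12**2 + r13**2 - 2.0 * r12 * r13 * r23) / (1.0 - r23**2)
    return math.sqrt(r_squared)

r12 = 0.48   # SAQ with GPA (quoted in the text)
r23 = 0.03   # SAQ with SAT-V (quoted in the text)
r13 = 0.41   # SAT-V with GPA: ASSUMED illustrative value, not from the article

print(round(multiple_r(r12, r13, r23), 2))
```

Because the two predictors are nearly uncorrelated, the squared multiple correlation is close to the simple sum of the two squared validities, which is the property the questionnaire was designed to have.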


The question as to whether SAQ scores 
differ significantly among sophomores, jun- 
iors, and seniors was handled by computing 
the t tests among scores for these classes.
None of the t’s in Table 2 reaches the 5%
level of significance. 


Table 1

Correlations among the Predictor Variables and the Criterion of Grade-Point Average and Means and Standard Deviations of the Three Variables

Variable    SAT-V    GPA            M                SD

SAQ         .03      .48            (illegible) a    13.03
SAT-V                (illegible)    499.7            80.06
GPA                                 3.2 b            .74

a The possible raw scores are from -41 to 38.
b The highest possible GPA is 5.0.



Table 2 


Means of SAQ Scores by Class and Significance of 
Difference of SAQ Class Score Means 


M 


Discussion 

The items of the SAQ were originally se- 
lected by comparing the responses of groups 
matched for scholastic aptitude (ACE-L 
score). The high criterion group’s members 
(overachievers) were, however, at least seven- 
tenths of a standard error of estimate above 
the regression line to predict GPA from ACE- 
L. The low group was selected from the area 
equally removed from and below the regression
line.

This procedure would tend to make for a 
desirably low correlation between SAQ and 
ACE-L. While the test underwent two vali- 
dations and a cross-validation in the IOWA 
situation, and, while the cross-validation sam- 
ple intercorrelation (SAQ vs. ACE-L) was 
only .07, there was speculation at the time 
that procedurally we might have “stacked the 
cards” in our favor and that “some increase 
in this low intercorrelation might be antici- 
pated for future samples where controls for 
ACE-L score are not exerted” (Schutter & 
Maher, 1956, p. 255). This apparently is not 
so. Table 1 shows that for this (different) 
sample and even for a different (although 
likely comparable) scholastic aptitude test the 
intercorrelation (r = .03) remains not significantly
different from r = .00. The planned-for
relationship is thus demonstrated to hold up.

While SAQ’s items were not selected to re- 
flect no differences among classes, the origi- 


nal finding of no significant score differences 
between freshmen and sophomores is seen to 
hold here in another institution and for dif- 
ferent classes, e.g., sophomores, juniors, and 
seniors (Table 2). Although mean SAQ 
scores drop off from the sophomore to the
senior level, none of the differences could be 
considered significant. It may be that there 
are differences in study activities among un- 
dergraduate classes. If so, the test does not 
reflect them, and, until such differences are 
demonstrated, the test may be considered 
suitable for use at all undergraduate levels 
without the necessity of providing separate 
norms. 

The paramount consideration must, of 
course, be that of validity. Since the test 
was keyed originally on compound probabili- 
ties (Baker, 1952) from two validation sam- 
ples and cross validated on a third sample, 
and since the original key was used here, no 
need was seen for the use of two (validation 
and cross-validation) samples in the present 
study. This procedure represents a cross- 
institutional cross-validation. Table 1 shows 
a comforting situation indeed, i.e., the test is 
valid in this different setting. The IOWA
cross-validation r was .36. The present r is
.48. At first glance this would appear to be
surprising, i.e., the repeat r apparently is 
higher than the original cross-validation. 
However, the fiducial limits at the 5% level 
for the original r of .36 are .18 to .52, and 
the repeat r of .48 is seen to be within these 
limits. 
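
The limits quoted above can be approximated with Fisher's z transformation, which is one common way of obtaining such an interval; the article does not say which method was used. The sketch below is hedged accordingly: the function name is hypothetical, and n = 102 is an assumed sample size chosen only for illustration, because the size of the IOWA cross-validation sample is not given in this passage.

```python
import math

def fisher_limits(r, n, z_crit=1.96):
    """Approximate 95% limits for a correlation via Fisher's z transformation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# n = 102 is an ASSUMED sample size used for illustration only.
low, high = fisher_limits(0.36, 102)
print(round(low, 2), round(high, 2))
print(low <= 0.48 <= high)   # is the repeat r inside the limits?
```

With a sample of roughly this size the interval is wide enough to contain the repeat r of .48, which is the point the text makes.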

In the IOWA setting, the combination of 
SAQ and ACE-L predicted GPA with R = .53. 
The better predictor in that instance was 
ACE-L, but SAQ raised the prediction by .12 
over ACE-L alone. In the present situation, 


however, the better predictor would appear to
be SAQ (Table 1). Perhaps this occurs because
SAT is used to select students for the
university. The multiple correlation is here
found to be .62 with SAT-V raising the multi- 
ple prediction by .14. The combination of 
SAQ with the scholastic aptitude tests, while 
one or the other apparently may contribute 
more heavily in the different situations, re- 
sults in equally high multiple prediction in 
the two institutions. 







Summary 


This study investigates the possibility that 
(unknown) institutional differences might af- 
fect the validity of a forced-choice study ac- 
tivity questionnaire (SAQ) validated at a 
state college and then applied in a private 
university. The findings are as follows: 

1. The cross-institutional cross-validation r is
.48, significantly higher than r = .00. The
original cross-validation was r = .36.

2. In one setting (state college), the more
significant predictor is the scholastic aptitude
test; in the other, it is SAQ, possibly as a
function of the selection of students by
the scholastic aptitude test. The multiple prediction
in the former situation was R = .53,
in the latter .62. Thus SAQ is significantly
valid and, together with the scholastic aptitude
test, gives significant multiple prediction
of grade-point average in both situations.



3. The study questionnaire, designed origi- 
nally to have no correlation with scholastic 
aptitude score, holds its sought-for lack of relationship
in this instance also.

4. The original finding of no difference be- 
tween freshman and sophomore SAQ scores is 
repeated here among sophomore, junior, and 
senior students. 


Received November 13, 1958 


References 


Baker, P. C. Combining tests of significance in
cross validation. Educ. psychol. Measmt., 1952,
12, 300-306.

Parres, J. G. Prediction of success in the undergraduate
schools of the University of Pennsylvania.
Unpublished doctoral dissertation, Univer.
of Pennsylvania, 1955.

Schutter, Genevieve, & Maher, H. Predicting grade
point average with a forced-choice study activity
questionnaire. J. appl. Psychol., 1956, 40, 253-257.





Journal of Applied Psychology
Vol. 43, No. 5, 1959


THE USE OF IBM MARK-SENSE CARDS AS MULTIPLE- 
CHOICE PAPER-AND-PENCIL TEST ANSWER 
FORMS 1


ERNEST MADRIL 2


Personnel Laboratory, Wright Air Development Center


Recent rapid advances in the field of elec- 
tronic computer technology have made most 
obvious the desirability of employing punch 
cards to record examinees’ responses to test 
items. The conventional methods of key 
punching test response data bottleneck and 
restrict the volume of test information that 
can be fruitfully employed in psychological 
testing. This paper is intended as a report of 
the experiences of the Personnel Laboratory, 
Wright Air Development Center in the use, 
on a world-wide scale, of mark-sense cards as 
multiple-choice answer forms. Within recent 
years several studies have been made relating 
to the use of mark-sense cards in the admin- 
istration of tests. Deemer (1948) discussed 
the use of mark-sensing in large-scale testing 
from the point of view of centralized admin- 
istration and control of such test administra- 
tion. He was concerned principally with the 
mark-sensing of test scores achieved by the 
examinees. Appel and Cooper (1955) dis- 
cussed the use of mark-sense cards as test 
answer sheets, but again from the point of 
view of centralized control and administration 
of psychological tests. A statistical report
prepared by the Educational Testing Service 
(1956) indicated that the test performance 
of examinees using conventional answer sheets 
did not differ significantly from the test per- 
formance of examinees using a new type an- 
swer card similar in size to the familiar IBM 
punched card. 

1 This report results from work done under ARDC
Project 7717, Task 17131, sponsored by the Personnel
Laboratory, Directorate of Laboratories, Wright
Air Development Center.

2 The invaluable assistance given by William S.
Harris in planning and programing is gratefully
acknowledged.

The United States Air Force program of
assessing its airmen’s on-the-job proficiency
began in 1952 and continues as a decentralized
test administration, and a centralized
scoring, test analysis, and research program
(Gilhooley, 1956). Currently the Personnel
Laboratory employs about 400 Airman 
Proficiency Tests to classify approximately 
200,000 airmen by skill levels yearly. Ap- 
proximately one-third of this block of tests 
is administered each month at an average of 
800 different installations, during a three- 
month cycle of testing. Four such cycles of 
testing are conducted yearly on an Air Force- 
wide scale. Personnel classification require- 
ments, accounting, and public relations dic- 
tate that test results be made available to the 
interested activities with the least possible 
delay. Initially, conventional IBM test scor- 
ing machines and desk calculators were used 
in test processing and evaluation. Gradually 
techniques requiring the use of IBM account- 
ing equipment were developed and adopted. 
Presently, IBM mark-sense cards, in booklet 
form as shown in Fig. 1, are employed in the 
Airman Proficiency Test Program. The 
mark-sense positions for the possible item 
responses are in letter form, such as those 
suggested by Morton, Hoyt, and Burke 
(1955). 

The mechanics of the centralized test proc- 
essing portion of the program follow in this 
sequence: 


1. Receipt and logging of test records.

2. Reproduction of mark-sensed cards and
key punching of selected information identifying
the examinee.

3. Editing of reproduced cards and general 
ordering of cards by test identification. 

4. Test scoring and preparation of fre- 
quency distributions and summary statistics. 

5. Linear transmutation of test scores and 
preparation of comparative summary sta- 
tistics. 

6. Preparation of rosters of test results by 
testing installation.

7. Mailing of test results to originating activities
and summaries of test results to Headquarters,
United States Air Force and to the
respective major air commands such as the
Strategic Air Command and Air Defense
Command.


Test item analyses, research, and evalua- 
tion, together with test booklet distribution 
and the dissemination of test administration 
instructions, overlap the operations listed 
above. 

Completing the cycle of operations required 
an average of seven weeks during the early 
stages of the program. Presently, through 
the medium of mark-sense cards, conventional 
IBM type electrical accounting equipment, 
and an IBM Type 650 electronic data proc- 
essing system, the average time is only 28 
days. The use of an electronic computer has 
reduced significantly the manual labor usu- 
ally involved in large-scale test processing, 
and the operating personnel requirements for 
test scoring and analysis have been slashed 
by better than 50%. 

An examination of Fig. 1 reveals that the
examinee identifying Form Card 1 is flexible 
and permits the entry of varying types of 
personal data by the examinee, which can 
either be key punched or mark-sensed in 
columns reserved on Card 2. Card 2 also 
provides for the mark-sense entry of certain 
identifying information which is required for 
test scoring and statistical analysis. The re- 
verse side of Card 2 and both sides of Card 3 
provide four response positions each for a 
total of 150 test items. The item response 
cards also contain two rows each upon which 
spaces have been reserved for centralized edit- 
ing and coding. These positions are employed 
when examinees omit responses to test items. 
Such omissions are arbitrarily coded as wrong 
responses. These “reserved” mark-sense posi- 
tions can be used as a fifth alternative re- 
sponse position by a minor modification of the 
card design. 

The cards now used by the Airman Pro- 
ficiency Test Branch of the Personnel Labo- 
ratory are intended for unspeeded tests, but 
similar cards are adaptable for speeded tests 
as well. Space for more than 150 test items 
can be provided by adding the necessary cards 
to the test booklet or by using additional
booklets. All cards of the booklet are prenumbered
and prepunched with the same
7-digit serial number and a 1-digit card num- 
ber. This facilitates collating operations dur- 
ing mark-sense reproduction and later in test 
scoring and analysis. 

Punching of mark-sensed identifying in- 
formation and test answers is accomplished 
through an IBM Type 519 document originating
machine equipped with a half-time
emitter, twenty-seven 12-4 and 5-9 column 
splits, and a 60-column multiple punch blank 
column detection device, and a punch offset 
stacker. The column splits and half-time 
emitter enable the offset reproduction of the 
first row of 25 item responses from the second 
row of each block of 50 items. This provides 
for a single answer punch in each of 50 card 
columns of the reproduced cards (Fig. 2). 
Each reproduced card also carries the control 
number, card number, test identity, and a 
5-digit code identifying the installations to 
which the results are to be remitted. Three 
test answer cards are produced from the two 
mark-sensed cards. Each card reflects the 
responses of the examinee to each of 50 test 
questions. As reproduction of the mark- 
sensed answer cards proceeds, selected identi- 
fying information from Card 1 is key punched. 
This step is required only because the Air 
Force needs to identify examinees by name as 
well as service number. Were alphabetic in- 
formation not required, or if it could be trans- 
ferred to punched card form by optical or 
similar electrically operated reading devices, 
key punching could be eliminated for the 
Airman Proficiency Test Program. 

Test scoring and the preparation of fre- 
quency distribution tables and summary sta- 
tistics, singly for the given administration 
period and cumulatively for previous admin- 
istrations of the same editions of the tests, 
are accomplished on the IBM Type 650 di- 
rectly from the mark-sensed reproduced cards 
at the rate of 1,800 raw scores an hour. The 
numerous scoring keys for the many tests are 
stored on the IBM Type 355 RAMAC. At- 
tained raw scores and test identification are 
stored, during the scoring operation, on mag- 
netic tape using IBM Type 727 tape drives. 
The IBM Type On-line 407 is employed during
scoring to record selected identifying information
and raw scores of examinees attaining
unexpectedly low or high scores, and for
the listing of frequency distributions which
are subdivided into as many as five subpopulations
of examinees by test. The summary
statistics are punched to provide for
variable listing as required.

Fig. 1. Mark-sense test answer booklet.

In test scoring, the examinee’s answers are 
positioned on the lower accumulator—a 10- 
digit word at a time—so that when the “key” 
word is added algebraically, the correct responses
selected by the examinee are converted
to the digit 5. This is followed with
the treatment of each item response individually,
thus: A digit 5 is added into the
high order position of the lower accumulator
which causes the value of that item response,
if correct, to equal 10 and carry into the
upper accumulator. The high order digit in
the lower accumulator becomes zero. By resetting
the upper accumulator, shifting the
lower accumulator left one position, and test- 
ing on the upper accumulator for non-zero, 
the machine can determine whether or not the 
item is correct. If the upper accumulator is 
zero, then that item response can be identi- 
fied as correct. An accumulation by tally of 
such right responses during iteration of the 
above routine, for as many items and words 
as necessary, will yield a total score. This 
procedure permits differential item weighting 
and selected item scoring through programing. 
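
The digit arithmetic described above can be imitated in a few lines. The sketch below is not IBM 650 code and makes several assumptions that the article does not state: responses are coded 1 to 4 with 0 for an omitted item, the key digit for each item is 5 minus the correct response, and a word holds 10 item digits with the high-order digit first. All function and variable names are hypothetical.

```python
def make_key(correct_answers):
    """Key digit for each item so that a correct answer plus key equals 5."""
    return [5 - c for c in correct_answers]

def score_word(answers, key):
    """Tally right answers in one word of item digits (high order first)."""
    # Word addition: digits 0-4 plus key digits 1-4 never carry between items.
    lower = [a + k for a, k in zip(answers, key)]
    right = 0
    for _ in range(len(lower)):
        # Add 5 to the high-order digit: a 5 becomes 10, leaving digit 0 and a
        # carry into the upper accumulator, which is then reset (discarded).
        lower[0] = (lower[0] + 5) % 10
        # Shift left one position: the high-order digit moves into the upper
        # accumulator, where zero marks a correct response.
        upper = lower.pop(0)
        lower.append(0)
        if upper == 0:
            right += 1
    return right

correct = [1, 2, 3, 4, 4, 3, 2, 1, 1, 2]   # hypothetical scoring key
answers = [1, 2, 3, 4, 1, 2, 3, 4, 1, 0]   # hypothetical examinee; 0 = omitted
print(score_word(answers, make_key(correct)))
```

Under this coding an omitted item can never sum to 5, so it is counted as wrong, in line with the editing rule described earlier in the article.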

Fig. 2. Reproduced test answer card.

Following the test scoring operation, a table
of the statistical factors (required for linear
transmutation of scores) and the program are
loaded into general storage. The key punched
identification cards (taken from Card 1) are 
read, test identification and score records 
are taken from tape and collated with the 
identification card data. The raw scores are 
transmuted, frequency counts by variable 
class intervals are prepared by test and other 
categories, a master tape record is written, 
and master cards are punched at the rate of 
6,000 standard scores an hour. As the proc- 
essing of each test is completed, comparative 
frequency counts by variable class interval 
and by major air command are prepared and 
listed through the On-line 407. 
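
A linear transmutation of raw scores is a rescaling to a chosen mean and standard deviation. The sketch below shows the general technique only; the target mean of 50 and standard deviation of 10 are assumptions for illustration, the raw scores are invented, and the article's own "statistical factors" are presumably precomputed per test rather than derived from each batch as done here.

```python
from statistics import mean, pstdev

def transmute(raw_scores, target_mean=50.0, target_sd=10.0):
    """Linearly rescale raw scores to a chosen mean and standard deviation."""
    m = mean(raw_scores)
    s = pstdev(raw_scores)
    if s == 0:
        raise ValueError("raw scores have zero variance")
    return [target_mean + target_sd * (x - m) / s for x in raw_scores]

raw = [62, 75, 88, 70, 95]   # hypothetical raw scores
standard = transmute(raw)
print([round(x, 1) for x in standard])
```

Because the transformation is linear, the rank order of examinees is unchanged, while scores from different tests become comparable on a common scale.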

The total operating time for test scoring, 
preparation of frequency distributions, statis- 
tical summarizing, production of test record 
cards and master tape records on the IBM 
650 tape RAMAC system for approximately 
16,000 examinees is in the range of 13 to 18 
machine hours. One operator needs to be 
present in the machine room, but only for 
insuring that cards are fed into the IBM Type 
533 card reader and that the tape drives are 
properly selected and set. This may be contrasted
with conventional test scoring, key
punching, and preparation of frequency dis- 
tributions employing the usual electrical ac- 
counting machinery, desk calculators, etc., for 
statistical purposes. The time and manpower 
differences are obvious. 

The only errors which have been experienced
in test scoring in the four months during
which the IBM 650 has been used have
resulted from examinee failure to properly
identify the test which was administered. This
error rate has been less than .5%. The reject
rate on the IBM 519 resulting from inadequate
mark-sensing by the examinee or from
item omission is estimated at 4% by volume.
This is far less than the reject rate on standard
answer sheets which could not be scored
by conventional IBM scoring machines.


Table 1

Test Performance of Examinees Using Conventional Answer Sheets Compared with Examinees Using Mark-Sense Cards on 315 Tests by Mean Scores *

Answer Forms Compared                      Number of Tests    Proportion

Conventional answer sheet means higher                        .48
Mark-sense card means higher                                  .36
Means equal                                49                 .16

Total                                      315                1.00

* Airman Proficiency Tests administered Air Force-wide during March-August 1958 to 96,720 airmen.

The change, from the use of conventional 
answer sheets and scoring procedures to mark- 
sense cards and electronic computing equip- 
ment for the Airman Proficiency Test pro- 
gram, was made on the assumption that 
examinees would do as well or better using 
the cards as when using the answer sheets. 
A rigorous empirical test of this hypothesis 
was not feasible due to the administrative 
restrictions and the need to continue pro- 
ficiency testing without interruption. There 
were, however, two probably similar popula- 
tions administered the identical tests at differ- 
ent times. One group, the earlier one tested, 
used conventional answer sheets and the sec- 
ond group tested used the mark-sense cards. 
All conditions of eligibility, test control, and 
administration remained the same excepting 
the instructions to the examinees for marking 
their respective test answer forms. There is 
no apparent reason to believe that the per- 
tinent characteristics of the two populations 
may have differed in ways other than at random.
Accordingly, a test of the null hypothesis
that the change in answer forms would
result in inferior test performance by the 
group having used the mark-sense cards seems 
appropriate. Three hundred fifteen different 
tests were administered to as many pairs of 
examinee populations. The raw score means
attained by each pair of populations were
compared. One hundred sixty-three of the
tests based on the use of answer cards yielded 
means equal to or lower than those based on 
conventional answer forms. The standard 
error of the observed proportions is .74. A 
divergence of .02 in favor of equal or lower 
means might be expected to occur by chance 
27 times in 100. Therefore, it is concluded 
that the use of mark-sense cards as multiple- 
choice test answer forms does not adversely 
affect the test performance of Air Force per- 
sonnel administered Airman Proficiency Tests. 
The conclusion is further confirmed by the 
distribution of mean scores resulting from the 
two administrations of the tests. Only 19 
of the 114 lower means of the mark-sense 
card populations were below 8 score points of 
their respective answer sheet population 
means. 
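
The chance probability quoted above can be checked with a short sketch. It assumes a normal approximation to the binomial with a null proportion of .50 and a one-tailed test; the text does not state the formula that was used, and the function name is hypothetical. The figure of .74 printed in the text is not reproduced here; only the proportion, its divergence from .50, and the 27-in-100 probability are examined.

```python
import math


def one_tailed_p_for_proportion(successes, n, p0=0.5):
    """Normal-approximation test of an observed proportion against p0.

    Assumption: the standard error is computed under the null value p0.
    """
    p_hat = successes / n
    se = math.sqrt(p0 * (1.0 - p0) / n)             # standard error under the null
    z = (p_hat - p0) / se                           # critical ratio
    p_upper = 0.5 * math.erfc(z / math.sqrt(2.0))   # P(Z >= z)
    return p_hat, se, z, p_upper


# 163 of 315 tests had card means equal to or lower than answer sheet means.
p_hat, se, z, p_upper = one_tailed_p_for_proportion(163, 315)
print(f"proportion = {p_hat:.3f}, divergence from .50 = {p_hat - 0.5:.3f}")
print(f"z = {z:.2f}, one-tailed probability = {p_upper:.2f}")
```

Under these assumptions the divergence of about .02 corresponds to a one-tailed probability of about .27, which agrees with the "27 times in 100" reported in the text.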

In summary, the transition from the con- 
ventional multiple-choice answer sheets and 
test scoring to mark-sensing and scoring 
through the aid of electronic computers is 
feasible and less expensive, when processing 
large volumes of test data, than conventional 
methods. Outside of reduction in costs, it is 
of greater importance to consider the sizable 
increase in data that readily become available 
for analysis and evaluation. This augmenta- 
tion in available data should result in the de- 
velopment of information essential to the 
improvement of paper-and-pencil tests con- 
structed for the Air Force. 

As optical reading technology develops, the 
need for key punching alphabetic information 
will be further reduced. It is conceivable 
that if all examinees were to be provided with 
plastic identifying plates embossed with the 
necessary fixed alphabetic and numeric in- 
formation such as name, date of birth, Social 
Security number, the plates could then be 
used in recording such information in suitable
form for the preparation of punched cards by 
optical readers. The use of optical readers 
will further enhance the employment of mark-sense
cards in the future, in that the readers
can be made more sensitive and can have 
greater discriminating power than the usual 
devices which rely on conductive rather than 
photoelectric sensing. 


Received November 20, 1958. 


References 


Appel, V., & Cooper, G. A refinement in the use of 
mark-sense cards for test research. J. Amer. Sta- 
tist. Ass., 1955, 50, 557-560. 




Deemer, W. L., Jr. The use of mark sensing in a 
large scale testing program. J. Amer. Statist. Ass., 
1948, 43, 40-52. 

Educational Testing Service. A study of a new type
answer sheet. Statist. Rep., 1956, No. 56-23. 

Gilhooley, F. M. Proficiency test development and 
research for the Airman Career Program of the 
United States Air Force. Amer. Psychologist, 1956, 
11, 547-553. 

Morton, Mary A., Hoyt, W. T., & Burke, L. K.
A new type of test answer sheet. Amer. Psychologist,
1955, 10, 572.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


RELATIONSHIPS BETWEEN PERSONAL AND SOCIAL 
DESIRABILITY SETS AND PERFORMANCE ON 
THE EDWARDS PERSONAL PREFERENCE 
SCHEDULE 


ALFRED B. HEILBRUN, JR. AND LEONARD D. GOODSTEIN


State University of Iowa 


The relationships between desirability fac- 
tors and probability of endorsement of self- 
descriptive statements on personality ques- 
tionnaires have aroused considerable research 
interest. Investigations of the Minnesota 
Multiphasic Personality Inventory (Hanley, 
1956; Rosen, 1956) and the Edwards Personal
Preference Schedule (Edwards, 1953)
have demonstrated a high positive correlation 
between the judged social desirability (i.e., 
the perceived desirability of a given trait 
in others) of a test statement and its selec- 
tion as self-characteristic. Edwards (1954) 
matched pairs of statements for social desir- 
ability in developing his test to minimize the 
importance of social desirability as a response 
determinant. 

There has been some recent evidence that 
a social desirability response set may not ac- 
count for all the response variance attribut- 
able to desirability. Heilbrun (1958) found a 
significant relationship between need scores 
on the Personal Preference Schedule (PPS) 
and the judged personal desirability of these 
needs, where this set was based on the per- 
ceived desirability of a given trait in oneself. 
Since there was evidence (Navran & Stauf- 
facher, 1954) that need scores on the PPS 
are uncorrelated with judged social desirabil- 
ity of the need, it was hypothesized that 
personal and social desirability sets might 
produce different test-taking behaviors. Goodstein
and Heilbrun (1959), in evaluating
the relationship between social and personal
desirability values of the individual PPS
statements, found a very high correlation
(r = .90) between the two sets of values.
While the extent of the correlation limited
any independent variation, a significant difference
between social and personal statement
values was demonstrable when only statements
associated with judged-low personally
desirable needs were considered. No differ- 
ence between personal and social desirability 
values was found for the high personally desirable
need statements. These authors suggested
that Edwards’ procedure of matching
items by social desirability values might be
less effective with statements measuring personally
undesirable needs to the extent that
Ss assume a personal desirability set.

Edwards (1953) has reported a positive
correlation (r = .87) between social desirability
values and probability of statement
endorsement. In an unpublished study, Heilbrun
and Goodstein found a similar relationship
(r = .90) between personal desirability
values and probability of endorsement. Each
of the two statements comprising an item on
the PPS has both a personal and social desirability
scale value. Because the statement
pairs on a number of items are not perfectly
matched for their desirability values, combinations
of high and low values are possible.
Indeed, it is possible to find items on the PPS 
where the higher personal and social desir- 
ability values are associated with different 
statements in the item. If social and per- 
sonal desirability sets operate in the same di- 
rection as response determinants, then their 
relationship to the probability of endorsement 
should be greater than when they operate 
in opposite directions. Frequency of endorse- 
ment thus offers a criterion against which the 
operation of desirability sets may be evalu- 
ated. The purpose of the present study was 
to investigate the hypothesis that a personal 
desirability set affects performance on the 
Edwards PPS independent of a social desir- 
ability set. 





Table 1

Data Concerning Statement Endorsement in Three Groups of Items
Selected from the Edwards Personal Preference Schedule

           Mean Difference in     Mean Difference in      Mean Number of
           Paired Statement       Paired Statement        Higher Socially
           Social Desirability    Personal Desirability   Valued Statements
           Values                 Values                  Endorsed

Group 1                                  0.76                    9.50
Group 2                                  0.52                   10.81
Group 3                                  1.32                   13.16


Method
Subjects


A sample of 248 undergraduate students (166 males
and 82 females) enrolled in an introductory psychology
class at the State University of Iowa were
administered the Edwards PPS under standard conditions.


Procedure


Social desirability scores, derived from pooled judgments
and converted into arithmetic values by a
successive interval scaling method described by Edwards
(1957), were available for each statement in
the PPS.1 Personal desirability values, derived in
the same fashion (Goodstein & Heilbrun, 1959),
were also available.

1 The authors wish to thank Allan Edwards for
making these unpublished statement values available
to them.

Three groups, each including 20 items (i.e., pairs
of statements), were then selected from the PPS according
to certain criteria. Group 1 consisted of
those items showing the maximum difference in the
social and personal desirability values of the statement
pairs and in which the higher social desirability
value was associated with one statement in a pair
and the higher personal desirability value with the
opposing statement.

Group 2 consisted of 20 items matched precisely
with Group 1 items for the magnitude of difference
in the social desirability values of the two statements.
Group 2 items differed from those in Group
1 in that the higher social value and the higher personal
value were both associated with the same statement
in an item pair. In Group 2 both types of desirability
values would lead to the prediction of the
same and more highly desirable response in a pair,
whereas in Group 1 these values would predict opposing
responses in the pair. Since differences in social
desirability values for statement pairs in Groups
1 and 2 are identical, less endorsement of higher socially
valued statements in Group 1 could then be
attributed to the personal value differences which
predict selection of opposing statements.

Group 3 consisted of 20 items which showed the
maximum difference between the social and personal
desirability values of the statement pairs and where
the higher social and personal values were associated
with the same statement in a pair. The expectation
here would be a stronger tendency to select the response
having the higher social desirability values
than would be found in either Group 1 or 2. The
analysis of Group 3 response selection is not crucial
to the hypothesis under investigation but is included
since it provides an estimate of response selectivity
under conditions where desirability factors should be
most influential.

The systematic influence of a response set based
upon position or order was avoided, since half the
responses predicted by the higher social desirability
value were first statements of the pairs and half
were the second statements.
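
The direction rule that separates Group 1 from Groups 2 and 3 can be sketched as follows. This is only an illustration under stated assumptions: the `Item` class, the helper names, and all scale values are invented for the example and are not the PPS statement values, and the ranking by maximum difference and the matching of Group 2 to Group 1 are not implemented.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One forced-choice item: scale values for its two statements (A and B)."""
    social_a: float
    social_b: float
    personal_a: float
    personal_b: float


def social_gap(item: Item) -> float:
    """Size of the social desirability difference between the paired statements."""
    return abs(item.social_a - item.social_b)


def direction(item: Item) -> str:
    """'oppose' when the two value types favor different statements, else 'agree'."""
    s = item.social_a - item.social_b
    p = item.personal_a - item.personal_b
    if s == 0 or p == 0:
        return "tied"
    return "agree" if s * p > 0 else "oppose"


# Invented values, for illustration only.
items = {
    "x": Item(3.2, 2.6, 2.4, 3.1),   # social favors A, personal favors B
    "y": Item(3.0, 2.4, 3.3, 2.9),   # both favor A; same social gap as "x"
    "z": Item(2.0, 3.5, 1.8, 3.4),   # both favor B, with large gaps
}

group1_candidates = [k for k, it in items.items() if direction(it) == "oppose"]
same_direction_candidates = [k for k, it in items.items() if direction(it) == "agree"]

print(group1_candidates)          # items eligible for Group 1
print(same_direction_candidates)  # items eligible for Group 2 or Group 3
```

In this toy data, item "y" would be a Group 2 match for item "x" because the two share the same social gap, while item "z" resembles a Group 3 item.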


Results 


The basic assumption underlying the ex- 
perimental predictions in the present study is 
that Ss will tend to select the item statement 
showing the higher social desirability value. 
This was confirmed by a preliminary analysis 
of statement endorsement. Out of the 204 
PPS items (eliminating six items with identi- 
cal social desirability scores for both state- 
ments), Ss selected an average of 114.07 
higher socially valued statements. This dif- 
fers significantly from a chance expectancy 
of 102 (t = 14.54, p < .001).
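
The chance expectancy and the size of the reported t can be related with a little arithmetic. The sketch below assumes that the reported value is a one-sample t of the mean against 102; the source does not state the formula, so the implied standard error and standard deviation are illustrative back-calculations, not figures reported by the authors.

```python
import math

n_items = 204            # items remaining after six tied pairs were dropped
n_subjects = 248
observed_mean = 114.07   # mean number of higher socially valued statements chosen
t_reported = 14.54

# With two statements per item, chance selection picks the higher one half the time.
chance_mean = n_items * 0.5

# Assumption: t = (observed mean - chance mean) / standard error of the mean.
implied_se = (observed_mean - chance_mean) / t_reported
implied_sd = implied_se * math.sqrt(n_subjects)

print(f"chance expectancy = {chance_mean:.0f}")
print(f"implied standard error = {implied_se:.2f}")
print(f"implied standard deviation = {implied_sd:.1f}")
```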

The mean number of statement endorse- 
ments associated with higher social desirabil- 
ity values in Groups 1, 2, and 3 is presented 
in Table 1. Means for Groups 1 and 2 differ 
significantly from each other (t = 5.33, p <
.001) in the direction of fewer endorsements
of higher socially desirable statements in pairs
when personal desirability value differences
would predict selection of the opposing re- 
sponse. 

It can be seen in Table 1 that the average 
endorsement of higher socially valued statements
in Group 3, in which paired statement
differences are large but in the same direction
for both desirability values, exceeds that for
both Groups 1 and 2. The difference in mean
endorsement between Groups 1 and 3 is
highly reliable (t = 13.79, p < .001), as is
the difference between Groups 2 and 3 (t =
9.89, p < .001).


Discussion

The finding that Ss showed less of a tend- 
ency to select more highly socially desirable 
statements as self-characteristic when personal
desirability values would predict endorsement
of the opposing statement of the
pair than when personal values would predict 
the same response lends support to the no- 
tion of some independent contribution of a 
personal desirability set to statement endorse- 
ment on the PPS beyond that attributable to 
a social desirability set. 

It is difficult to assess the practical signifi- 
cance of the finding that personal desirability 
values are related to performance on the PPS 
but were not considered in the original state- 
ment matching procedure. Since the differ- 
ence between the endorsement frequencies in 
Groups 1 and 2 is not large in an absolute 
sense (an average of 1.31 fewer of the higher 
socially valued statements were selected out 
of the 20 in Group 1) and was obtained when 
personal and social desirability value differ- 
ences were maximized, it would appear that 
utilization of only social values for matching 
purposes does not represent a crucial flaw in 
Edwards’ attempt to minimize desirability of 
verbal statements as an important source of 
performance variance. 

The matching procedures also may be ques- 
tioned in that the statements in several of 
the items are not closely matched and appar- 
ently allow desirability to play a major role 
in statement endorsement. The Ss in this 
study averaged endorsing more than 13 higher 
socially valued statements in a set of 20 such 
items (Group 3). This finding is in agree- 
ment with that of Borislow (1958, p. 27), 
who concluded that the PPS can be faked un- 
der personal and social desirability instruc- 
tions, but he modified this by adding that the 
Schedule “is not greatly susceptible to the influence
of fakability in terms of choice of socially
desirable items, per se.” Although the
number of items included in Group 3 in the
present study is not great considering the 
total of 210 items from which scale scores 
are derived, a closer inspection of these 20 
items suggests that the failure to match 
closely could have a marked effect on cer- 
tain individual scales. It was found that the 
20 statements having the higher desirability 
values were fairly well spread with respect to 
the need scale which each measures on this 
test (Endurance, 4; Intraception, 3; Succor- 
ance, Dominance, Order, Affiliation, Hetero- 
sexuality, 2 each; Exhibition, Change and 
Nurturance, 1 each). Thus the failure to 
match closely should tend to inflate scores on 
at least 10 scales, with the Endurance and 
Intraception scales being affected most. How- 
ever, when the statements having the lower 
desirability scores in the 20 pairs were ex- 
amined, the results were much more striking 
(Aggression, 9; Exhibition, Dominance, and 
Abasement, 3 each; Autonomy, 2). The find- 
ing that in at least 9 of the 28 items (32%) 
in which statements reflecting need Aggression 
appear these statements are mismatched and 
lower in desirability value certainly suggests 
caution in the utilization of the Aggression 
score. Even more caution is suggested by the 
finding that in 14 of the remaining 19 items 
involving Aggression statements, these statements
have lower social and personal desirability
values, in three items the Aggression
statements have lower social or personal desirability
values, and in only two items do
the Aggression statements have higher personal
and social desirability values. Though
further specific checks for systematic matching
flaws are beyond the scope of this investigation,
it is worthwhile noting that the introduction
of error into any scale on the PPS by
improper matching procedures is bound to in- 
troduce error into the remaining scales be- 
cause of the forced choice nature of the test 
(i.e., the failure to select one statement in a 
pair as self-characteristic automatically as- 
signs a score to another scale). 
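
The forced-choice point can be illustrated with a minimal scoring sketch. The function name and the scale pairings below are hypothetical and do not reproduce the actual composition of PPS items; the sketch only shows that every choice credits exactly one scale, so the total across scales is fixed by the number of items.

```python
from collections import Counter


def score_forced_choice(items, choices):
    """Score forced-choice items.

    items:   list of (scale_of_statement_A, scale_of_statement_B) pairs
    choices: list of 'A' or 'B', one per item
    """
    scores = Counter({scale: 0 for pair in items for scale in pair})
    for (scale_a, scale_b), choice in zip(items, choices):
        scores[scale_a if choice == "A" else scale_b] += 1
    return scores


# Hypothetical pairings in which the Aggression statement is always statement A.
items = [
    ("Aggression", "Endurance"),
    ("Aggression", "Order"),
    ("Aggression", "Intraception"),
]

# A subject who always rejects the less desirable Aggression statement
# necessarily adds a point to each competing scale.
scores = score_forced_choice(items, ["B", "B", "B"])
print(dict(scores))
print(sum(scores.values()) == len(items))
```

Because the total is constant, any systematic loss on one scale from mismatched pairs must appear as a gain on the scales paired with it.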


Summary 


This study was concerned with the hypothesis
that a personal desirability set operates
somewhat independently of a social
desirability set in determining response selection
on the Edwards Personal Preference
Schedule. To test this hypothesis 248 col- 
lege Ss were administered the PPS. 

Three groups of 20 items each were selected 
from the PPS: one group included items hav- 
ing a maximum difference in the social and 
personal desirability values of the paired 
statements in each item and for which the 
two types of desirability values predicted 
selection of opposing statements (with the 
higher socially valued statement predicted for 
endorsement); the items in the second group 
were precisely matched with those in the first 
group for between-statement differences in so- 
cial desirability values, but personal values 
predicted endorsement of the same state- 
ments; the third group of items were those 
in which there was a maximum difference in 
the personal and social desirability values of 
the paired statements but in which both 
values predicted the selection of the same 
response. 

The hypothesis of some independent effects 
of personal and social desirability sets upon 
response endorsement was supported by the 
finding that significantly fewer of the higher 
socially valued statements were endorsed 
when personal values predicted endorsement 
of the opposing statement than when the per- 
sonal values predicted selection of the same 
response. 


When the differences between personal and 
social desirability values of paired statements 
were maximal and both values predicted the 
endorsement of the same response, Ss averaged
selecting the higher valued statement
over 13 out of 20 times. Inspection of these 
20 items indicated that specific scales, pri- 
marily the need Aggression Scale, may be 
especially vulnerable to a desirability set be- 
cause of mismatching and should be inter- 
preted with caution. 


Received November 21, 1958. 


References 


Borislow, B. The Edwards Personal Preference
Schedule and fakability. J. appl. Psychol., 1958,
42, 22-27.

Edwards, A. L. The relationship between judged
desirability of a trait and the probability that the
trait will be endorsed. J. appl. Psychol., 1953, 37,
90-93.

Edwards, A. L. Personal Preference Schedule. New
York: Psychological Corp., 1954.

Edwards, A. L. Techniques of attitude scale construction.
New York: Appleton-Century-Crofts,
1957.

Goodstein, L. D., & Heilbrun, A. B. The relationship
between personal and social desirability values
of the Edwards Personal Preference Schedule. J.
consult. Psychol., 1959, 23, 183.

Hanley, C. Social desirability and responses to items 
from three MMPI scales: D, Sc, and K. J. appl. 
Psychol., 1956, 40, 324-328. 

Heilbrun, A. B. Relationships between the Adjective 
Check List, Personal Preference Schedule, and de- 
sirability factors under varying defensiveness con- 
ditions. J. clin. Psychol., 1958, 14, 283-287. 

Navran, L., & Stauffacher, J. C. Social desirability 
as a factor in Edwards Personal Preference Sched- 
ule performance. J. consult. Psychol., 1954, 18, 
442. 

Rosen, E. Self-appraisal, personal desirability, and 
perceived social desirability of personality traits.
J. abnorm. soc. Psychol., 1956, 52, 151-158.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


SUBORDINATES’ PERCEPTIONS OF THE PRODUCTIVE 
ENGINEER 


ROBERT E. STOLTZ 


Southern Methodist University 


Of major importance in our attempts to un- 
derstand and predict the behavior of persons 
is our knowledge of how the individual per- 
ceives not only himself, but those others who 
occupy space in his particular psychological 
field. This report summarizes an exploratory 
study dealing with how young engineers in a 
unique student-subordinate role perceive the 
engineers they chose to term “productive.” 

There are at least three reasons why a 
study of this type is useful and needed. First, 
it should provide suggestions for hypotheses 
about the productive process in this field that 
may be further studied in more highly con- 
trolled and more specific inquiries. Second, 
it is planned that a future study will attempt 
to compare these data with similar data pro- 
vided by a sample of engineering research su- 
pervisors. This would enable one to deter- 
mine what discrepancies, if any, exist between 
these two different status levels. Third, the 
results are useful in their own right in giving 
us a better idea of what performances are 
seen as linked to the productive behavior of 
persons who serve as models for the subordi- 
nates in assessing and molding their own in- 
dividual behavior. 


Method 
Subjects 


The Ss used in this study were 80 male, third-year 
engineering students currently attending the coopera- 
tive training program of a southwestern engineering 
school. As part of his training, each S spends ap- 
proximately one-half of each full year working in 
industry. The Ss were enrolled in two sections of 
an elementary psychology course, forty students in 
each section. Age, major subject, and number of 
months engaged in industrial work as part of the 
cooperative training program were not significantly 
different between the two sections. The average 
number of months of industrial experience for the 
combined groups was 19.8 months. 


Instrument 


The instrument used in this investigation was the 
Productive Behavior Checklist developed by Stoltz
and described in detail elsewhere (Stoltz, 1958,
1959). Briefly, this checklist consists of 250 state- 
ments taken from interviews with physical science 
and engineering research supervisors. Pages of the 
checklists were assembled in random order to mini- 
mize any consistent tendency for certain items to be 
rated differentially due to their serial position within 
the checklist.


Procedure

One section of the Ss was given copies of the 
checklist and asked to describe the behavior of the 
most productive engineer they knew by rating each 
item on a five-point scale according to how typical 
the behavior described by the item was of the par- 
ticular person they were rating. A rating of 5 for 
an item would indicate a very typical behavior. The 
second section of the Ss was also given copies of the
checklist, but was asked to describe the least productive
engineer they knew. All of the Ss were
cautioned to describe only one particular individual 
and not to describe productive or nonproductive en- 
gineers in general. They were also encouraged to be 
as fair as possible and indicate the person’s weaknesses
as well as his strong points.

The analysis of the individual items was made by 
computing the t ratio between the means for each
item of the productive and nonproductive set of rat- 
ings. Since the limitations of space preclude listing 
the statistics for each of the 250 items within the 
checklist, only the 30 most discriminating and the 
30 least discriminating items will be given here in 
Tables 1 and 2, respectively. 
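
The item analysis described above can be sketched as a two-sample t ratio for a single item. The sketch assumes a pooled-variance t for independent groups, which is consistent with the 78 df quoted in the table notes for two sections of 40 but is not confirmed by the source. The function name and the ratings are invented for illustration.

```python
import math
from statistics import mean, variance


def t_ratio(group_a, group_b):
    """Pooled-variance t ratio for two independent groups, with its df."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    se_diff = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(group_a) - mean(group_b)) / se_diff, df


# Invented five-point ratings of one item by the two sections of 40 Ss.
productive = [5] * 20 + [4] * 20
nonproductive = [3] * 20 + [2] * 20

t, df = t_ratio(productive, nonproductive)
critical_t = 2.64   # value quoted in the table notes for 78 df at the .01 level
print(f"df = {df}, significant at .01: {abs(t) > critical_t}")
```

Repeating the calculation for each of the 250 items and ranking the resulting ratios would give the ordering used to pick the most and least discriminating items.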


Results 


One hundred and fifty-one of the 250 possible
t ratios were significant at or beyond
the .01 level of confidence. This high level
of confidence was selected since it seems rea- 
sonable to assume that some amount of halo 
effect might tend to inflate the differences be- 
tween the items. This effect might be ex- 
pected in view of the values attached to the 
terms “productive” and “nonproductive” and 
due to the favorable wording of almost all of 
the items. While it is quite possible that 
some halo effect accounts for a portion of 
the remaining significant differences, it seems
equally likely, in view of the design, that
there are real differences in how the productive
person is perceived relative to the nonproductive
by the Ss.

1 A complete list of all the items, giving their
means, standard errors, and t ratios, may be obtained
by writing the author.


Table 1

Thirty Most Discriminating Items

Item                                                   t

Is a clear thinker
Is very efficient
Can handle anything given him
Keeps his mind on his work
Does more than his share of the work
Is not easily distracted from his job
Has real interest in his job
Has good attitude toward his work
Has ingenuity
Has good leadership qualities
Comes up with new ways of doing things
Does not go off on tangents
Has more than casual interest in his work
Can grow into a job
Does not seem lazy
Develops respect in other people
Takes responsibility well
Is a self-starter
Is technically competent
Has a store of information to apply to problems      5.49
Can pick out important details in a problem          5.46
Is orderly in his work                               5.45
Is conscientious                                     5.44
Can organize the work of others                      5.39
Thinks of better ways of doing things                5.30
Good, sound technical background                     5.29
Technical background is above average                5.28
Can evaluate alternative approaches to problem       5.20
Offers his share of creative ideas                   5.15
Directs work of others effectively                   5.14

Note. Positive t ratios indicate that the item was rated as
more typical of productive than of nonproductive engineers.
With 78 df, a t must exceed 2.64 to be significant at the .01 level.


Discussion 


An analysis of the content of the discrimi- 
nating items indicated that a summary of the 
stereotype existing for the Ss might best be 
presented by considering their comments to 
fall into four major content areas. This clas- 
sification was arbitrarily made, and the cate- 
gories are not assumed to be mutually inde- 
pendent or exclusive. 


Intellectual Activity 


The productive engineer is seen as a versa- 
tile person, intelligent, with good analytical 
reasoning ability. This versatility is appar- 
ently restricted to activity within the engi- 
neering area, and it is not clear whether the 
subordinates consider it to extend into other
activities. The subordinates indicate that this 
intellectual activity is controlled by a sense 
of the practical problems involved in a task, 
and, while the productive engineer is willing 
to try unique approaches or attempt new 
methods in search of a solution to a prob- 
lem, the practicality of the solution is an im- 
portant determinant in his selection of an 
approach. Throughout their comments the
subordinates appear to make a distinction
between the immediate task area, i.e., engineering
problems, and activities that are primarily
outside of the task area. This distinction
is marked in their comments regarding
the producer’s creativity. The productive
engineer is felt to be highly creative within
his area of competence, particularly as regards
his novel use of past experience, but he
is apparently not seen as being creative in a
more general sense. These views of the subordinates
agree in many respects with the
findings of Harrison, Hunt, and Jackson
(1955a, 1955b, 1955c) regarding the performance
of mechanical engineers on a number
of common psychological tests.


Table 2

Thirty Least Discriminating Items

Item                                                   t

Does not hide behind his degree                      0.00
Likes administrative work                            0.00
Does not fly off the handle                          0.15
Active in social groups in the company               0.18
Does not jump into an explanation                    0.20
Is emotionally stable                                0.20
Is not gruff at times                                0.23
Is not cold and aloof                                0.36
Active in company sponsored activities               0.36
May be curt sometimes                                0.38
Is at ease with others                               0.40
Can’t stand to be unsuccessful                       0.48
Is not bound by limits                               0.49
Does not worry about trivialities                    0.62
Difficult to get excited                             0.64
Willing to put in extra time                         0.71
Interested in some things outside of his field       0.83
Does not offend others                               0.91
Is not irritable                                     0.92
Does not hurt others’ feelings                       0.92
Wants recognition                                    0.92
Has a need to be recognized                          0.37
Is intelligent but not the brightest                 0.56
Has a lot of outside interests                       0.62
Sometimes impatient with others                      0.73
May try to do everything himself                     0.78
Is not impatient with others                         1.02
Considers the problem of money                       1.08
Does not hurt feelings of others unnecessarily       1.08
Has imagination                                      1.08

Note. Positive t ratios indicate that the item was rated as
more typical of productive than of nonproductive engineers.
With 78 df, a t must exceed 2.64 to be significant at the .01 level.


Motivation 


One of the most marked characteristics of 
the productive engineer is his tremendous in- 
terest in his work. More specifically, his in- 
terest in problems within the engineering field 
is seen as quite strong, but he is not perceived 
as more strongly interested in the company 
than the nonproducer. His interest in his 
work group is apparently seen as somewhat 
intermediate in intensity. Earlier research 
on the perceptions of physical science re- 
search supervisors indicated that the more 
highly motivated scientists might be more 
prone to aggressive attacks on fellow workers 
than the less motivated scientists. This hy- 
pothesis receives little confirmation from the 
subordinates. They indicate that the pro- 
ducer is highly persistent in his efforts to 
reach an objective, but he is not seen as a 
person who tends to verbally strike out at 
others and exhibit other signs of aggression 
more than the nonproducer. The evidence 
given by Roe (1951a, 1951b) indicating that
strong intrinsic motivation and quite permis- 
sive expression of hostility were character- 
istic of successful physical scientists lends 
weight to the former hypothesis, but again 
does not appear to be confirmed. Perhaps 
the answer to this apparent contradiction of 
earlier results lies in the status level of the 
Ss used in this study. It might be hypothe- 
sized that the subordinates do not interpret 
aggressive responses directed toward them- 
selves by the productive persons as aggressive 
or as unwarranted, although similar remarks
directed by nonproducers might be so interpreted.
The halo effect which was mentioned
earlier might have operated to discount the 
importance of aggressive acts by the highly 
regarded productive engineers. There was no 
provision in the present study for testing 
either of these hypotheses. 

The producer is seen as making a number 
of distinctions concerning when and where 
this motivation can be expressed. The sub- 
ordinates see him as being quite willing to 
take work home, but no more willing than the 
nonproducer to put in extra time on the job 
(this presumably means within the physical 
confines of the plant). Other responses of 
the subordinates follow this pattern and indi- 
cate that the producers make a distinction 
between time, or perhaps effort, spent on the
job and time spent at the job. The subordi- 
nates see the productive engineer as reacting 
negatively to the physical confines and re- 
quirements of the plant, but willing and ready 
to produce in a more self-regulated situation. 
The findings of Van Zelst and Kerr (1951, 
1952, 1954) regarding the belief in voluntary 
determination of deadlines as a character- 
istic of productive scientists support this per- 
ception of the subordinates. 


Personality and Social Factors 


The producer is seen as having a high de-
gree of independence needs and initiative,
again chiefly within the job area, and with
a definite orientation toward accepting re-
sponsibility. This should not be interpreted
as describing completely what might be 
termed “lone wolf” behavior. The producer 
is not seen as a person who attempts to do 
everything himself, but as one who will seek 
out assistance and help from others. Again 
it appears that task orientation might be the 
major determinant of whether or not he will 
sacrifice independence for the support which 
he might need. Although the subordinates 
see the producer as accepting help, they do 
not seem to feel that he is any more willing 
than the nonproducer to seek help from the 
experts. The subordinates indicate, however, 
that the producer is probably more aware of 
his shortcomings and deficiencies than the 
nonproducer. A suggested hypothesis might 
be that the less task involved the person is,
the less he will seek support from others.
Harrison et al. (1955a, 1955b, 1955c) have 
indicated that there is a tendency for me- 
chanical engineers to be somewhat authori- 
tarian in their own activities and for them 
to accept authoritarian solutions somewhat 
readily. Perhaps the better answer to the be- 
havior of the producers in this area will be 
some combination of hypotheses regarding 
acceptance of authority and ego involvement 
in the task. 

The subordinates do not appear to view 
the producer as any more likeable or agree- 
able than the nonproducer, but they do 
consider him to be somewhat more mature, 
although perhaps more aloof, than the non- 
producer. The nonproducers are seen as be- 
ing more “thick-skinned” than the producers 
and as being more likely to upset others and 
to be quick toward others. The content of 
these items suggested that what the subordi- 
nates might be describing were reactions of 
the nonproducers which interfered with the 
nonproducers’ task performance or which 
were expressions of the nonproducers’ own 
frustration. In order to clarify this problem, 
a number of the Ss were questioned regard- 
ing their responses to these items. The item 
regarding “thick-skinned” behavior was inter- 
preted by some of the Ss as referring to the 
unwillingness or inability of the nonproducers 
to perceive or accept attempts made to cor- 
rect or improve their performance. More- 
over, the responses of the nonproducers to 
these attempts, presumably in the light of 
their own perceptions of themselves as non- 
producers in an environment where produc- 
tion is highly regarded, tended to be “upset- 
ting” to the subordinates and others in the 
work groups. 

The producer is not seen as a particularly 
social person, and the data suggest that he 
might even be less likely than the nonpro- 
ducer to have an active social life outside the 
company group. 


Administrative Activity 


The subordinates see the producer as hav-
ing the ability to capably administer his own
work and the work of others, but as not be-
ing particularly fond of administrative work.
This might well be due to the belief, or fact,
as some of us feel, that administrative work
is typically routine work. The subordinates 
point out quite strongly that the productive 
person does not like routine work, hence he 
probably does not look with favor on adminis- 
trative assignments. 

The producer is seen as being efficient in 
handling the work assigned to him, but the 
subordinates do not consider him as being 
less likely than the nonproducer to emphasize 
nonessentials or to get wrapped up in tech- 
nical details. Extreme rigidity in task be- 
havior does not seem to be a characteristic of 
the producers, as the subordinates point out 
that the tendency to “think of things as either 
black or white” is more characteristic of the 
nonproducers. 

The importance of communication in the 
behavior of the producer is indicated by the 
subordinates. Their comments suggest, how- 
ever, that they tend to evaluate oral com- 
munication more highly than written com- 
munication. 

The producer, as might easily be expected, 
is accorded a high degree of respect by the 
subordinates and is considered by them to be 
a person for whom they might like to work. 
The data suggest that this feeling of the sub- 
ordinates is much more akin to respect than 
it is to a general liking or feeling of warmth 
toward the producer. 

At this stage of inquiry, it is best to gen- 
erally regard these findings as indicative only 
of a stereotype of the productive engineer 
existing within our particular sample of sub- 
ordinates. Whether or not this stereotype is 
valid in the sense of accurately describing the 
behavior of engineers who are productive in 
terms of other, perhaps more objective cri- 
teria, remains to be seen. The stereotype 
probably is valid in the sense of its operating 
with the work situations to influence the be- 
havior of the subordinates and to influence 
the subordinate’s attempts to vary or alter his 
own behavior in order to reach a highly
valued position. For example, from this 
analysis we might expect the subordinate to 
value skill in oral communication more highly 
than skill in areas of written communication. 
The subordinate may then attempt to rely 
on his verbal ability, particularly his ability 
to speak glibly in the Carnegie manner, and
reject or even oppose efforts to increase his
ability to express himself in more formal,
written efforts. The extent to which this
value is shared by his superiors may well de-
termine his own success or failure as an engi-
neer in terms of a criterion of approval by
superiors. 


Summary 


Eighty student engineers enrolled in the 
cooperative training program of a large south- 
western university described the work behav- 
ior of productive and nonproductive engineers 
using a 250-item checklist developed in an 
earlier study. Approximately 60% of the 
items within the checklist discriminated sig- 
nificantly between the descriptions of the 
productive and nonproductive engineers at or 
beyond the .01 level of confidence. A de- 
scription of the stereotype of the productive 
engineer as seen by the subordinates was de- 
veloped from an inspection of the responses 
to the items. Tables indicating the 30 most, 
and 30 least, discriminating items are given. 


Received November 24, 1958 




References 


Harrison, R., Hunt, W., & Jackson, T. A. Profile
of the mechanical engineer: I. Ability. Personnel
Psychol., 1955, 8, 219-234. (a)

Harrison, R., Hunt, W., & Jackson, T. A. Profile
of the mechanical engineer: II. Interests. Person-
nel Psychol., 1955, 8, 315-330. (b)

Harrison, R., Hunt, W., & Jackson, T. A. Profile
of the mechanical engineer: III. Personality. Per-
sonnel Psychol., 1955, 8, 469-490. (c)

Roe, Anne. A psychological study of physical sci-
entists. Genet. psychol. Monogr., 1951, 43, 121-
235. (a)

Roe, Anne. Psychological tests of research scientists.
J. consult. Psychol., 1951, 15, 492-495. (b)

Stoltz, R. E. Development of a criterion of research
productivity. J. appl. Psychol., 1958, 42, 308-310.

Stoltz, R. E. Factors in supervisors’ perceptions of
physical science research personnel. J. appl. Psy-
chol., 1959, 43, 256-258.

Van Zelst, R. H., & Kerr, W. A. Some correlates of
scientific and technical productivity. J. abnorm.
soc. Psychol., 1951, 46, 470-475.

Van Zelst, R. H., & Kerr, W. A. A further note on
some correlates of scientific and technical produc-
tivity. J. abnorm. soc. Psychol., 1952, 47, 129.

Van Zelst, R. H., & Kerr, W. A. Personality self-as-
sessment of scientific and technical personnel. J.
appl. Psychol., 1954, 38, 145-147.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


FACTOR ANALYSIS OF REPORTED MINOR 
PERSONAL MISHAPS¹


J. D. KEEHN 


American University of Beirut 


While clinicians (Alexander, 1952; Dunbar, 
1943; English & Pearson, 1945; Rowson, 
1944) are generally of the opinion that cer- 
tain personalities are more likely to have re- 
peated accidents than others, some psycho- 
metricians like Webb (1956), Arbous and 
Kerrich (1951), and a number of other re- 
cent writers have argued that the concept of 
“accident proneness” has not been convinc- 
ingly demonstrated by accident statistics. 

Thus on the one hand it is claimed that 
“the accident prone individual is an impetu- 
ous person [who] harbors a deeply ingrained 
rebellion against the excessive regulations of 
his upbringing [and] has a strict con- 
science which makes him feel guilty for his 
rebellion” (Alexander, 1952, p. 214) and on 
the other that large scale Air Force and Navy 
studies “give strong evidence that accidents 
may not be predicted from preceding acci- 
dent behavior” (Webb, 1956). That is, some 
workers deny that accident likelihood can be 
predicted from past accident records, while 
others believe that a particular accident per- 
sonality can be described precisely because 
such predictions can be made. 


Large scale studies can be cited in support 
of either view and it is unlikely that extensive 
investigations of industrial, aviation, or high-
way accidents will do much to clarify the
issue. Although there are many reasons why
this should be so, the most likely one is that 
large scale studies concern themselves only 
with actual accidents which are serious enough 
to be reported and which occur only in a nar- 
row range of situations. Hence Webb (1956) 
is careful to say that aircraft pilots cannot be 
selected “on the basis of aircraft accident his- 
tories” although he leaves open the question 


1 This study was made possible by a grant from 
the Rockefeller Foundation to the Arts and Sciences 
Division of the American University of Beirut. Ac- 
knowledgment is made to the Rockefeller Foundation 
and to Emma Oshagan, Rita Tabourian, and Ihsan 
al-Issa who assisted in the collection and analysis of
the data.


of the predictive nature of other, minor mis- 
haps. When it is noted that nonindustrial, 
home accidents were almost double industrial 
accidents in 1951 (National Safety Council, 
1952) the restrictive nature of specific indus- 
trial accident data is apparent. Similarly, 
near accidents where prompt action of one in- 
dividual averts injury to another do not usu- 
ally become incorporated into accident sta- 
tistics. 

That “nonspecific” accidents might predict 
aircraft accidents due to “pilot error” has
been shown by Kunkle (1946). Thus per-
sonal injuries like sprains, cuts, fractures, and
dislocations were significantly related to fly-
ing accidents in the group of pilots that he
studied. On the other hand, mishaps like
falling down stairs, trapping fingers in doors, 
and driving accidents showed no such rela- 
tionship. It is possible, then, that there are 
certain classes of “personal” accidents and 
that some of these classes may be predictive 
of other accident classes in industrial, aircraft, 
or driving situations. The present study sets 
out to investigate the first of these possibili- 
ties in a Near Eastern cultural setting. 


Method 


A questionnaire containing 41 statements about
minor mishaps was administered to 100 male uni-
versity students between the ages of 18 and 25 years.
Most of the Ss were Arabs. The questionnaires were
filled out anonymously and individually by the Ss as
part of a larger study in which they were paid for
their services. University students rather than in-
dustrial workers were used as Ss in order to over-
come the possibility that some Ss might falsify their
responses through fear of jeopardizing their jobs.
Anonymous responses were requested as a further
check against possible falsification. The question-
naire and instructions are shown in Table 1.

Undecided and negative responses were combined
and tetrachoric correlations between the items com-
puted by means of the tables of Chesire, Saffir, and
Thurstone (1933). Items on which more than 80%
of the responses fell into one or other category were
excluded from further consideration owing to the
unreliability of tetrachoric r’s when computed from
extreme cuts. Items 29, 34, and 37 were eliminated
on this basis. The remaining table of intercorrela-
tions was factored by Thurstone’s centroid method
using highest column correlations as estimated com-
munalities.


Table 1


Accident Index


Will you please answer each question by putting a circle round “yes” or “no.” If you cannot make up your
mind, circle the “?.” Work quickly and do not worry too long about the exact meaning of each question. There
are no right or wrong answers, and no trick questions. Remember to answer every question as accurately as
you can.

Response options printed after each item: Yes ? No

1. Do you often seem to cut yourself when you use sharp things?
2. Do you often bump into things and hurt yourself?
3. Have you ever eaten bad food or accidentally drunk a poisonous liquid?
4. Do you tend to make mistakes when you are writing?
5. Have you ever accidentally torn a book or newspaper or similar object?
6. Have you ever trapped your finger in a door?
7. Do people tend to bump into you on the street?
8. Do you find that by the time you made up your mind over something it is too late?
9. As a child did you always seem to be hurting yourself one way or another?
10. Have you ever broken one of your bones?
11. Do you tend to drop things and break them?
12. Do you often burn yourself by touching hot places?
13. Have you ever burned your mouth by eating or drinking something that was too hot?
14. Did you ever swallow a harmful object as a child?
15. Would you call yourself a careless person?
16. Are you the kind of person who always seems to be knocking things over?
17. Do you think you are an unlucky kind of person?
18. Do you sometimes bite your tongue when talking or eating?
19. Have you ever been almost hit by a car or other vehicle?
20. Do you often seem to be twisting or spraining your ankles or wrists?
21. Have you ever accidentally received an electric shock?
22. Have you ever hit your finger accidentally with a hammer?
23. Do you tend to spill things frequently?
24. Do your belongings seem to wear out quicker than you expect?
25. Do you sometimes misunderstand what people are saying to you?
26. Do you often tend to lose or misplace things?
27. As you walk do you sometimes trip over things?
28. Do you find it difficult to write neatly without making mistakes or marks on the paper?
29. Would you say that you are the kind of person who often has accidents?
30. Have you ever scalded yourself by, for instance, putting your hand in a hot liquid or putting your foot into a hot bath?
31. Do you frequently bruise yourself?
32. Do you find yourself sometimes forgetting things that you know very well?
33. Have you ever fallen down stairs?
34. Do you find difficulty in remembering which is the hot tap in your bathroom?
35. Have you ever mistaken the time after looking at your watch?
36. Have you ever felt yourself in danger while swimming?
37. Are you the kind of person who is frequently late for appointments?
38. Do you have one or more scars on your body?
39. Have you ever touched a hot stove or similar object by mistake?
40. Do you tend to get ink on your fingers while you are writing?
41. Do you ever find that people’s feelings are hurt by things you say?
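The item screening described in the Method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the correlation uses the cosine-pi approximation to the tetrachoric coefficient, which is a stand-in for, not a reproduction of, the computing diagrams of Chesire, Saffir, and Thurstone that the study actually used.

```python
import math


def code_response(answer):
    """Code "yes" as 1; "no" and "?" (undecided) are combined and coded 0."""
    return 1 if answer == "yes" else 0


def is_extreme_split(item, cutoff=0.80):
    """True if more than `cutoff` of the coded responses fall in one category."""
    p_yes = sum(item) / len(item)
    return p_yes > cutoff or (1.0 - p_yes) > cutoff


def tetrachoric_cos_pi(x, y):
    """Cosine-pi approximation to the tetrachoric r between two 0/1 items.

    Assumption: this approximation replaces the published computing
    diagrams; it is not the procedure reported in the paper.
    """
    a = b = c = d = 0  # cells of the 2 x 2 table
    for i, j in zip(x, y):
        if i and j:
            a += 1      # yes / yes
        elif i:
            b += 1      # yes / no
        elif j:
            c += 1      # no / yes
        else:
            d += 1      # no / no
    ad, bc = a * d, b * c
    if ad == 0 and bc == 0:
        return float("nan")  # table too degenerate to estimate
    if bc == 0:
        return 1.0
    return math.cos(math.pi / (1.0 + math.sqrt(ad / bc)))
```

Items flagged by `is_extreme_split` would be dropped before the correlations are computed, which mirrors the exclusion of items with more than 80% of responses in one category.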


Results 


Most of the item intercorrelations were posi- 
tive and 102 were significant at the .01 level 
where about 7 would have been expected by
chance. Even allowing for the fact that the
correlations are not all independent of each
other this evidence testifies to the significance
of the correlation matrix as a whole.² Three
factors were extracted and centroid and ro- 
tated loadings are shown in Table 2. The 
third factor though of doubtful significance 
was retained to facilitate rotation. Rotations 
were carried out graphically and blindly to- 
wards “orthogonal simple structure.” 
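The chance expectation of about 7 significant correlations follows from the number of item pairs. A minimal check, assuming the 38 items retained for analysis and treating the pairs as if they were independent (which, as the text notes, they are not strictly):

```python
from math import comb

n_items = 38                     # 41 items less the 3 eliminated
alpha = 0.01                     # significance level used in the text

n_pairs = comb(n_items, 2)       # distinct item intercorrelations
expected_by_chance = alpha * n_pairs

print(n_pairs, round(expected_by_chance, 2))
```

With 703 distinct pairs, roughly 7 correlations would reach the .01 level by chance, against the 102 observed.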

The first point of significance is that the 
first unrotated centroid is a general factor 
with no negative loadings. Of the 38 items 
included in the analysis, 13 have loadings 
greater than .50, a further 11 have loadings 
of .40 and above, and only 2 items have load- 
ings below .20. Apart from the possibility 
that the factor reflects a general tendency to 
agree with the items in the questionnaire it is 
clear that this finding demonstrates that in- 
dividuals who admit to having accidents re- 
port having them in a wide variety of situa- 
tions, some of which, like losing things (load- 
ing .59), wearing out one’s clothes quickly 
(loading .52) and hurting people’s feelings 
(loading .46) would not normally be re- 
garded as accident situations. 
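The extraction of a first centroid factor can be illustrated with a short Python sketch. It assumes a correlation matrix whose column sums are all positive, so that no reflection of variables is needed, and it places the highest off-diagonal correlation of each column in the diagonal as the communality estimate, as described in the Method. The function name is hypothetical and the sketch covers the first factor only.

```python
import math


def first_centroid_loadings(r):
    """First centroid loadings for a square correlation matrix (list of lists).

    Each diagonal cell is replaced by the largest absolute off-diagonal
    correlation in its column (the estimated communality). Each loading is
    the column sum divided by the square root of the grand total.
    """
    n = len(r)
    m = [row[:] for row in r]  # work on a copy
    for j in range(n):
        m[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    if total <= 0:
        raise ValueError("column sums must be positive; reflect variables first")
    return [s / math.sqrt(total) for s in col_sums]


# Illustrative matrix only, not data from the study.
example = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
loadings = first_centroid_loadings(example)
```

In this toy matrix every loading equals the square root of .5, so each squared loading reproduces the communality estimate of .5. A matrix of mostly positive correlations yields all-positive first loadings in the same way, which is the pattern reported for the general factor.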

Interpretation of the rotated factors by 
means of the items with the highest loadings 
is by no means clear. Factor I’ is character- 
ized by Items 12, 26, and 41, viz.: 


12. Do you often burn yourself by touching hot 
places ? 

26. Do you often tend to lose or misplace things? 

41. Do you ever find that people’s feelings are hurt 
by things you say? 


The items with the highest loadings on 
Factor II’ are: 


35. Have you ever mistaken the time after looking 
at your watch? 
38. Do you have one or more scars on your body ? 
5. Have you ever accidentally torn a book or 
newspaper or similar object? 


Factor III’ is best demonstrated by the
following items:


27. As you walk do you sometimes trip over
things?

40. Do you tend to get ink on your fingers while
you are writing?

25. Do you sometimes misunderstand what people
are saying to you?


2 The intercorrelation matrix has been deposited
with the American Documentation Institute. Order
Document No. 6025 from the ADI Auxiliary Publi-
cations Project, Photoduplication Service, Library of
Congress, Washington 25, D. C., remitting in ad-
vance $1.25 for 35 mm. microfilm or $1.25 for
6X2 in. photocopies. Make checks payable to:
Chief, Photoduplication Service, Library of Congress.


If instead of attempting to interpret the 
factors by reference to the most highly satu- 


Table 2


Centroid and Rotated Factor Loadingsᵃ of Each of the
Items in the Accident Index
Entered in the Correlation Matrix


Columns: Item; Centroid FI, FII, FIII; Rotated FI’, FII’, FIII’

[Table body not recoverable from this copy.]


ᵃ Decimal points omitted.







rated items a rough analysis of the content 
of the items is made and then compared with 
the factor pattern, a more consistent pattern 
can be seen. Thus Items 1, 4, 5, 11, 21, 22, 23, 
28, and 40 are all to do with manipulation of 
the hands. Of these nine items, seven have 
their highest loadings on Factor III’. Six 
items, Nos. 6, 12, 16, 20, 30, and 39 involve 
injury to an extremity where manipulation by 
that extremity is not necessarily implied. Of 
these items, four are most highly loaded on 
Factor II’ and the others on Factor I’. Items 
2, 7, 16, 31, 33, and 38 all pertain to gross 
bodily injuries and four of them have their 
highest loadings on Factor I’ and 2 on Factor 
II’. It is possible, then, tentatively to label 
the factors as depicting injuries due to ma- 
nipulation of the extremities, involvement of 
the extremities, and gross bodily involvement, 
respectively. It is recognized, however, that 
such an interpretation is highly speculative 
and, even if correct, may not generalize be- 
yond the sample used in the present study. 


Summary and Conclusions 


A factor analysis was performed on the in- 
tercorrelations between the responses of 100 
university students, most of whom were Arabs, 
to 38 statements about accidents and minor 
mishaps. A general factor was found to run 
through all the statements indicating that in- 
dividuals who admit to having accidents in 
one situation also indicate that they have been
involved in accidents in other situations. Such 
a finding does not contradict the notion of 
“accident proneness” and suggests the possi-
bility that some minor accidents and mishaps
might be predictive of subsequent major acci- 
dents. 

Rotation of the centroid axes to “orthogonal 
simple structure” yielded three group factors 
and an attempt was made to interpret those 
factors in terms of the kinds of items having 
their highest loadings on the respective fac- 
tors. While the particular findings of the 
present study may be restricted to the cul- 
ture in which the data were collected, it is 
felt that the method has been sufficiently sug- 
gestive to warrant further use in the study of 
accidents. 


Received November 25, 1958. 


REFERENCES 

Alexander, F. Psychosomatic medicine. London:
Allen and Unwin, 1952.

Arbous, A. G., & Kerrich, J. E. Accident statistics
and the concept of accident proneness. Biometrics,
1951, 7, 340-432.

Chesire, L., Saffir, M., & Thurstone, L. L. Com-
puting diagrams for the tetrachoric correlation co-
efficient. Chicago: Univer. Chicago Press, 1933.

Dunbar, F. Psychosomatic diagnosis. New York:
Harper, 1943.

English, O. S., & Pearson, G. H. J. Emotional
problems of living. New York: Norton, 1945.

Kunkle, E. C. The psychological background of
“pilot error” in aircraft accidents. J. aviat. Med.,
1946, 17, 533-567.

National Safety Council. Accident facts. Chi-
cago: Author, 1952.

Rowson, A. J. Accident proneness. Psychosomat.
Med., 1944, 6, 88-94.

Webb, W. B. The prediction of aircraft accidents
from pilot-centered measures. J. aviat. Med.,
1956, 27, 141-147.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


JOB SATISFACTION STUDY OF TWO SMALL 
UNORGANIZED PLANTS 


B. J. SPEROFF 


Lithographers and Printers National Association 


The present study utilized the Tear Ballot 
for Industry, General Opinions, a measure 
which has been used and reported on fre- 
quently over the last fifteen years, on a group 
of 36 workers from two small independently 
owned and unorganized plants. The test it-
self consists of 10 items relating to job se-
curity, company welfare, supervisory ability,
working conditions, interpersonal relation-
ships, income, communications, confidence in
the “intentions” and the “good sense” of the
management, and personal happiness. The
administration time for the test is extremely 
short, it is completely anonymous, and the 
testee merely tears his answers right on the 
sheet. 


Table 1 


Pearsonian Coefficients of Correlation Between 
Tenure Rate and Tear Ballot Items 


1. Does the company make you feel that your
job is reasonably secure as long as you do
good work?

2. In your opinion, how does this company
compare with others in its interest in the
welfare of the employee?

3. How does your immediate supervisor com-
pare with other managers, foremen, or sec-
tion leaders as to supervisory ability?

4. Considering your work, are your working
conditions comfortable and healthful?

5. Are most of the workers around you the kind
who still remember you when you pass them
on the street?

6. Do you think your income is adequate for
your living needs?

7. Do you feel that you have proper oppor-
tunity to present a problem, complaint, or
suggestion to management?

8. Do you have confidence in the good inten-
tions of the management?

9. Do you have confidence in the good sense of
the management?

10. What effect is your experience with the com-
pany having upon your personal happiness?


Denotes significance at the 1% level of confidence.


Subjects and Procedure 


The personnel of two small independently owned
unorganized plants—one manufacturing lawn and
porch furniture (N = 22) and the other hand-woven
machine belts (N = 14)—were administered coded
tear ballots. The job tenure rate (total years on
the labor market divided by the number of jobs held
during the same period) for each worker was com-
puted and utilized “as an independent criterion
against which to correlate the individual job satis-
faction items of the Tear Ballot for Industry” (Kerr,
1948, p. 279). Table 1 summarizes these correlates.
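The tenure-rate criterion and its correlation with an item can be sketched as follows. The function names and the sample figures are hypothetical and serve only to show the arithmetic; they are not data from the two plants.

```python
import math


def tenure_rate(years_on_labor_market, jobs_held):
    """Job tenure rate: total years on the labor market per job held."""
    return years_on_labor_market / jobs_held


def pearson_r(x, y):
    """Pearsonian coefficient of correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical workers: (years on the labor market, jobs held, item score)
workers = [(12, 4, 3), (20, 2, 5), (6, 3, 2), (15, 3, 4)]
rates = [tenure_rate(years, jobs) for years, jobs, _ in workers]
scores = [score for _, _, score in workers]
r_item = pearson_r(rates, scores)
```

The same `pearson_r` function would serve for the validity and retest coefficients reported in the next section, given the corresponding pairs of scores.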


Validity and Reliability 


The purpose of this study was to test the 
validity of the tear ballot on the premise that 
the higher the job satisfaction scores, the 
lower will be the job-related interpersonal 
communicative contacts between labor and 
management members. The number of job- 
problem sessions for a period of one year was 
thus utilized as the validation criterion based 
upon the hypothesis that the job-satisfied and 
happy worker has fewer job-related interview
sessions than does the job-dissatisfied or un-
happy worker; i.e., the frequency of such ses-
sions should be inversely related to job
satisfaction. Combining the data from both 
plants, a Pearsonian coefficient of correlation 
of —.76 was found between job satisfaction 
scores and the number of job-related inter- 
view sessions. 

Reliability was established by retesting the 
sample two weeks later. This coefficient of 
correlation was .81, as compared to the ac- 
cumulated mean of .83 reported by Kerr 
(1951). 


Received November 25, 1958. 


References 


Kerr, W. A. On the validity and reliability of the
job satisfaction tear ballot. J. appl. Psychol.,
1948, 32, 275-281.

Kerr, W. A. Validation of a measure of job satis-
faction. Amer. Psychologist, 1951, 6, 360. (Ab-
stract)





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


NUMERICAL ERROR CHECKING 


E. T. KLEMMER 


IBM Research Center, Yorktown Heights, New York 


Numerical error checking has often been 
used as an item in clerical aptitude tests, but 
there has been little interest in the psycho- 
logical aspects of error checking itself. The 
present study addresses itself to two impor-
tant questions about numerical error check-
ing: (a) What is the effect of grouping digits
on the speed and accuracy of error checking?
(b) How does the probability of error affect
the speed and accuracy of error checking?
The present study was designed to provide
answers to these questions for the situation in
which S checks numbers on one page against
numbers on another page. This is fairly rep-
resentative of many actual error checking
situations, although some specialized tasks 
may require a very different spatial separa- 
tion of the numbers. 

Another aspect of error checking behavior 
which is of interest to the psychologist is the 
rate of information processing by S. Error 
checking requires so little overt output that 
speed is, for all practical purposes, completely 
limited by input and internal processing re- 
strictions. In this study estimates are made 
of the rate of handling information during 
error checking. 


Method 


The numbers to be checked were printed on pairs 
of separate 8½-in. by 11-in. pages. Both pages of
each pair had the same format of random numerals 
and most of the numerals were the same on both 
pages. S’s task was to mark those digits which were 
different. 

Every page contained 32 rows of numbers with 
a space between each group of 4 rows. The number 
of columns varied somewhat because of differences 
in the horizontal grouping (which was one of the 
experimental variables), but all pages were within 
the range of 32-40 columns so that the total number 
of digits per page varied only between 1024 and 
1280. All pages were printed by offset from plates 
prepared directly on the printer output of an IBM 
704 computer. Figure 1 shows one actual page of 
a typical pair. The other page of the pair differed 
only in that some of the numbers were different. 


Groupings 


Ten different horizontal groupings were employed 
involving groups of one through ten digits. The 
space between groups was enlarged as the number 
of digits in each group was increased in order that 
the over-all matrix of numbers would be about the 
same width for all groupings. 


Error Probability 


Three different error probabilities were used: 0.1, 
.01, and .001. Error probability is defined as the 
probability that any digit on one page will be dif- 
ferent from the digit in the corresponding position 
on the comparison sheet. For example, for error 
probability .01 approximately one digit in a hundred 
was changed on one sheet of each pair. It was 
changed to a digit chosen randomly from the nine 
remaining digits. Note that since the errors were 
determined probabilistically, the number of errors 
per page was not controlled exactly in any test.
Each of the three error probabilities was used with 
each of the ten groupings, making a total of 30 dif- 
ferent pairs of pages, hereafter called tests. Three 
alternate forms of the 30 tests were available and 
are designated A, B, and C. 
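The construction of a test pair, as defined above, can be sketched in Python. The function names are hypothetical, and the sketch does not reproduce the program actually run on the IBM 704; it only follows the stated rule that each digit is changed with the given error probability to one of the nine remaining digits.

```python
import random


def make_test_pair(n_digits, error_probability, rng):
    """Return (standard, comparison) digit lists for one test pair.

    Each comparison digit differs from the standard digit with probability
    `error_probability`; a changed digit is drawn at random from the nine
    remaining digits.
    """
    standard = [rng.randrange(10) for _ in range(n_digits)]
    comparison = []
    for digit in standard:
        if rng.random() < error_probability:
            others = [k for k in range(10) if k != digit]
            comparison.append(rng.choice(others))
        else:
            comparison.append(digit)
    return standard, comparison


def count_errors(standard, comparison):
    """Number of positions at which the two pages differ."""
    return sum(1 for s, c in zip(standard, comparison) if s != c)


rng = random.Random(1959)  # arbitrary seed, for repeatability only
standard, comparison = make_test_pair(32 * 32, 0.1, rng)
n_errors = count_errors(standard, comparison)
```

Because each digit is altered independently, the number of errors on a page varies around the expected value (about 102 for a 1024-digit page at probability 0.1), which is why the count per page was not controlled exactly.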


Subjects 


Volunteer college students served as paid Ss in 
all tests. 


Procedure 


Two separate studies were run with separate groups 
of Ss. One group of 30 naive Ss took all 30 of the 
tests (pairs of pages) of Form A in a balanced 
design. That is, all Ss took all tests in order, but 
each S started with a different test. The order of 
tests was such that grouping changed between tests, 
but error probability changed only twice during the 
experiment. Another group of four Ss took a series 
of practice trials on tests of Form A and then tests 
of Form B and Form C. The four trained Ss thus 
took two tests for each grouping and error proba- 
bility, but because of time limitations the tests with 
grouping by 7 and 9 were omitted. Each of the four 
Ss took the 48 tests in a different order. The order- 
ing was symmetrical with respect to error-probability 
and grouping. Both studies were conducted with 
sessions approximately 40 min. long during which 
the Ss completed two to four pairs of test pages, 
depending upon grouping and error probability. 
All Ss were instructed to work as fast as possible 
and still check every number on the page. The time 
taken for each pair of pages was recorded to the 
nearest 4% min. for each S. 
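The design counts and the balanced ordering for the naive group can be checked with a short sketch. It assumes that "each S started with a different test" means a cyclic rotation of one fixed test sequence; the construction of that base sequence itself (with error probability changing only twice) is not modeled here.

```python
N_GROUPINGS = 10
N_ERROR_PROBABILITIES = 3
N_TESTS = N_GROUPINGS * N_ERROR_PROBABILITIES      # 30 tests per form

# Trained Ss: groupings by 7 and 9 omitted, two forms (B and C).
TRAINED_TESTS = (N_GROUPINGS - 2) * N_ERROR_PROBABILITIES * 2


def rotated_orders(n_tests):
    """One order per S: the same sequence, started at a different test."""
    base = list(range(1, n_tests + 1))
    return [base[k:] + base[:k] for k in range(n_tests)]


orders = rotated_orders(N_TESTS)
```

Under this assumption the 30 orders form a Latin square: every test occupies every serial position exactly once across the 30 Ss.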







652224 
63137 
917130 
108394 


058749 
325124 
158309 
399360 


$32332 
176042 
599807 
889879 


17452 
898511 
g25295 
690439 


721307 
747416 
£77900 
596849 


228479 
572236 
Bl4882 
236738 


625716 
017603 
262375 
089768 


Sfof07 
89437 
989432 
875564 


385047 
244560 
122f88 
348898 


965376 
58760 
650857 
390592 


814024 
269673 
829599 
438921 


09f954 
115259 
402916 
32836Z 


804371 
055603 
594699 
406828 


823227 
144094 
162199 
153608 


4652U4 
03462 / 
718395 
622106 


042920 
64807 
417826 
058024 


894519 
341569 
011902 
5 £3611 


328257 
082071 
308482 
655921 


252469 
597419 
459462 
838546 


882898 
354506 
368998 
Zi11l770 


$63397 
043857 
5Z9276 
032059 


423958 
05198 
150208 
006043 


528336 
279248 
119233 
152186 


548485 
129716 
258934 
838450 

Fic. 1. 
format, but some of the numbers were different. 
digits which were different (errors). 
ability of 0.1. 


The Ss indicated the errors by drawing a single 
line through the discrepant digits on one page of 
the pair only. In the thirty-S experiment, each S
was informed of the actual number of errors on each 
test immediately after finishing the test. This knowl- 
edge of results was not given in the study with only 
four Ss because the error keys were not available at 
that time. 


Results 
Speed 
The results of both experiments are shown 
in Fig. 2 and 3, which plot the speed of check- 
ing (pairs of digits compared per second) 
against the number of digits in the horizontal 
groups. Note that, in all cases, the speed is
lowest for groups of one digit, highest for
groups of three or four digits, and then falls 
off again for larger groups. The average per- 
formance of both sets of Ss is such that speed 
of checking with groups of one is only 56% 
of the speed with groups of three, and speed 
of checking with groups of ten is 67% of the 
speed with groups of three. 


Accuracy 


The accuracy of checking did not change
in any regular way with the size of horizontal
grouping, but differences for the three error
probabilities were noted. A greater percentage
of errors is detected by the Ss when many
errors are present, as shown in Fig. 2 and 3.
Correct detection is 96% for error probability
.1, 87% for error probability .01, and 76%
for error probability .001 in the thirty-S
experiment. The four practiced Ss showed a
range from 98% to 83% for the same tests.
Note that the number of errors remaining
undetected is directly related to the original
number of errors. For the .001 error-probability
tests only 0.2 errors remain per thousand
digits after checking. For the .1 error-probability
test, two and four errors per thousand
digits remain after checking by the average
practiced and naive S, respectively.


Fig. 2. Speed of numerical error checking as a
function of horizontal grouping and error probability.
Each point represents the average of 30 Ss and
is based on time scores for checking a pair of pages
with about 1000 digits on each page.
Panel: 30 naive Ss. Abscissa: digits in each
horizontal group (1 to 10). Ordinate: pairs of digits
compared per second. Legend (error probability,
percent not detected): .1, 4; .01, 13; .001, 24.
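The residual-error figures quoted in the text follow from simple arithmetic on the detection percentages. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the expected number of errors per thousand digits is 1000 times the error probability.

```python
def residual_errors_per_thousand(error_probability, proportion_detected):
    """Expected errors left per 1000 digits after one checking pass.

    Assumes 1000 * error_probability errors are present before checking
    (an assumption made for this sketch).
    """
    errors_present = 1000 * error_probability
    return errors_present * (1.0 - proportion_detected)


# Detection proportions reported in the text for the thirty-S experiment.
naive_detection = {0.1: 0.96, 0.01: 0.87, 0.001: 0.76}

for p, detected in naive_detection.items():
    remaining = residual_errors_per_thousand(p, detected)
    print(f"error probability {p}: {remaining:.2f} errors remain per 1000 digits")
```

With the practiced Ss' 98% detection at error probability .1, the same function gives two errors per thousand digits, which agrees with the text.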

The number of false marks (correct num- 
bers marked as errors) is small compared to 
the mistakes of omission and also shows no 
consistent change with grouping. For the 
thirty-S experiment the number of false marks 
averaged 1% or less of the total marks made 
in tests with each of the error probabilities. 
The four practiced Ss made false marks num- 
bering only 0.4% of the total marks in the 
.1 error-probability test and no false marks 
in the other tests. 


Informational Measures 


In the present study the following informa- 
tion processing operations are involved. First 
the S must locate, perceive, and store temporarily
one or more digits from the standard
page of each pair; locate and perceive the
corresponding digits from the comparison 
page; then compare the two sets of digits to 
detect differences. Since there appears to be 
no simple way of estimating the information 
involved in locating the digits, spatial in- 
formation will be neglected in the following 
analysis. 

After locating any particular digit on the 
standard page, the perception and temporary 
storage of that digit represent assimilation 
of one out of ten equally likely alternative 
digits. In terms of the Shannon-Wiener measure
of information, this represents log₂ 10 or 3.32
bits of information. Since the comparison
digits are highly predictable from the standard
digits, the information content of the
comparison digits is low. The average uncertainty
of a comparison digit is .02 bits, .11
bits, and .79 bits for error probabilities .001,
.01, and .1, respectively, as calculated from
the probability distribution over all possible 
comparison digits with knowledge of the cor- 
responding standard-page digit. The total 
informational input for each pair of digits 
(neglecting position information) is thus the 
sum of the uncertainties of each digit of the 
pair or 3.34, 3.43, and 4.11 bits for error 
probabilities .001, .01, and .1, respectively. 
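The uncertainty values above can be approximated with a short calculation. This is a sketch under a stated assumption that the text does not spell out: an erroneous comparison digit is taken to be equally likely to be any of the nine digits other than the standard digit. The function names are hypothetical.

```python
import math


def comparison_digit_uncertainty(error_probability):
    """Uncertainty (bits) of a comparison digit given the standard digit.

    Assumption for this sketch: with probability 1 - p the digit is
    unchanged, and with probability p it is replaced by one of the
    nine other digits, each equally likely.
    """
    p_same = 1.0 - error_probability
    p_each_other = error_probability / 9.0
    return -(p_same * math.log2(p_same)
             + 9.0 * p_each_other * math.log2(p_each_other))


def total_input_bits(error_probability):
    """Standard digit (log2 10 bits) plus the comparison-digit uncertainty."""
    return math.log2(10) + comparison_digit_uncertainty(error_probability)


for p in (0.001, 0.01, 0.1):
    print(p,
          round(comparison_digit_uncertainty(p), 3),
          round(total_input_bits(p), 2))
```

Under this assumption the totals round to 3.34, 3.43, and 4.11 bits, matching the text. The comparison-digit uncertainty for error probability .001 comes out near .015 bits, slightly below the .02 reported, so the original calculation may have differed in detail or rounding.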

The total information input as calculated 
above is not actually processed by the S, that 
is, he does not perform perfectly. The S may 
make mistakes at any stage of locating, per- 
ceiving, comparing, and marking the digits. 
We cannot accurately determine from the 
final results how many mistakes were made 
in each operation, but fortunately there is 
good evidence that the great bulk of mistakes 
are of a single class. Consider that mistakes 
of perceiving, comparing, or marking would 
almost certainly lead to more false positive 
mistakes in marking the test page than mis- 
takes of undetected errors since 90% to 
99.9% of the digits are subject to false posi- 
tive mistakes and only .1% to 10% of the 
digits are subject to mistakes of omission. 
The actual data from the Ss show that false 
positives make up only a small fraction of the 
total mistakes. This strongly suggests that 
most of the mistakes made by the Ss are 
caused by failures to compare some pairs of 
digits at all. That is, most failures in the S’s
performance are due to skipping blocks of
digits on both the standard and comparison 
pages. Note that if S skipped a different 
number of digits on the two pages it would 
lead to many apparent errors close together, 
which condition the S knows is highly im- 
probable. Therefore, this type of skipping is 
largely self-correcting. 

If the undetected errors are considered due 
to the S’s failure to compare the digits in- 
volved, then it may fairly be assumed that 
the S also fails to compare many of the digits 
which are actually the same. His over-all 
speed of performance may therefore be rea- 
sonably reduced by the proportion of un- 
detected errors. Clearly this correction should 
be applied separately to each S and Test but 
for the .001 error-probability test (where the 
correction is most important) the number of 
errors is so small (zero to four per page) that 
no reasonable estimate of percentage of un- 
detected errors is possible for each S sepa- 
rately. The correction for undetected errors 
is therefore made simply by multiplying the 
average speed of checking by the proportion 
of errors correctly detected. Since the per- 
centage of correctly detected errors shows no 
regular variation with grouping, the percent- 
ages are also averaged over groupings. 





Fig. 3. Speed of numerical error checking as a
function of horizontal grouping and error probability.
Each point represents the average of 4 Ss and
is based on time scores for checking two pairs of
pages with about 1000 digits on each page.
Panel: 4 practiced Ss. Abscissa: digits in each
horizontal group (1 to 10). Ordinate: pairs of digits
compared per second. Legend: error probability
and percent not detected.


Fig. 4. Rate of information handling in numerical
error checking as a function of horizontal grouping
and error probability. See text for a description of
the method used in deriving informational rates from
the speed scores of Fig. 2 and 3.
Panel label: naive Ss. Abscissa: digits in each
horizontal group (1 to 10). Ordinate: information
in bits per second. Curves: error probabilities
0.1, .01, and .001.


The speed of checking, as shown on the 
ordinates of Fig. 2 and 3, was corrected for 
mistakes as described above and then multi- 
plied by the total information inputs 3.34, 
3.43, and 4.11 bits as derived above. Fig- 
ure 4 plots the informational rates against 
grouping for both sets of Ss and all three 
error probabilities. Figure 4 shows that the 
Ss were working at approximately the same 
informational rate even though the error 
probability varied over a 100: 1 range. The 
difference between the four practiced Ss and 
the 30 unpracticed Ss is maintained. The 
shape of the plot of information rate against 
grouping is the same as the shape of the un- 
corrected speed curves of Fig. 2 and 3, since 
the correction to the information measure is 
not a function of grouping. 
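The conversion from speed scores to informational rates described above can be sketched as follows. The bits-per-pair values and detection proportions are those given in the text for the thirty-S experiment. The speed passed in the example is a hypothetical value, since the plotted speeds are not tabulated in the text, and the function name is an assumption of this sketch.

```python
# Total information input per pair of digits (bits), from the text.
BITS_PER_PAIR = {0.001: 3.34, 0.01: 3.43, 0.1: 4.11}

# Proportion of errors correctly detected by the 30 naive Ss,
# averaged over groupings, from the text.
DETECTED_NAIVE = {0.001: 0.76, 0.01: 0.87, 0.1: 0.96}


def information_rate(pairs_per_second, error_probability,
                     detected=DETECTED_NAIVE):
    """Bits per second after correcting speed for undetected errors."""
    corrected_speed = pairs_per_second * detected[error_probability]
    return corrected_speed * BITS_PER_PAIR[error_probability]


# Hypothetical speed of 2.0 pairs per second on a .1 error-probability test.
print(round(information_rate(2.0, 0.1), 2))
```

Because the correction factor and the bits per pair depend only on error probability, the same multiplier applies at every grouping, which is why the corrected curves keep the shape of the speed curves.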


Summary and Conclusions 


Speed and accuracy of numerical error 
checking were studied as a function of the 
probability of randomly placed errors and 
horizontal grouping of the digits. Ten group- 
ings (1 through 10 digits) and three error 
probabilities (0.1, .01, and .001) formed the 
basis of tests given to four practiced Ss and 
30 naive Ss. 

The speed of error checking was highest 
for groupings of three or four digits and fell 
off for smaller or larger groups. Compared 
with grouping by three, groups of one digit
were checked an average of 44% slower and
groups of ten were checked an average of 
33% slower. 


Speed of checking was inversely related to 
error probability so that the .001 error-prob- 
ability tests were checked most rapidly and 
the 0.1 error-probability tests, most slowly. 
This increased speed on the low error-prob- 
ability tests was accompanied by a higher 
percentage of undetected errors so that the
Ss were handling information at about the
same rate with all three error probabilities. 

The accuracy with which Ss checked errors 
showed no identifiable variation with hori- 
zontal grouping, even though speed of check- 
ing varied greatly with size of horizontal 
group. The great majority of the S’s mis- 
takes were failures to detect errors actually 
present. 


Received December 1, 1958. 





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


COGNITIVE SIMILARITY AND INTERPERSONAL 
COMMUNICATION IN INDUSTRY 1


HARRY C. TRIANDIS 2 


Cornell University 


The present paper reports a test of the hy- 
pothesis that cognitive similarity affects the 
process of interpersonal communication. It 
presents methods for the measurement of cog- 
nitive similarity and shows that the measures 
obtained are related to perceived effectiveness 
of communication and liking between two 
people. Since permanent, long-standing rela- 
tionships were necessary for purposes of the 
study, supervisors and subordinates in indus- 
try were used as Ss. Other pairs, such as 
child-parent, therapist-patient, or student- 
teacher, could have been used, though each 
presents special difficulties. A laboratory 
replication of the study has been reported 
elsewhere (Triandis, 1959a). 

Two kinds of cognitive similarity are con- 
sidered. The first, categoric similarity, is ob- 
tained by comparing the categorizations of 
two Ss, through an adaptation of Kelly’s 
(1955) Role Repertory Test. The second, 
syndetic similarity, is obtained by comparing 
the ways concepts are associated with other 
concepts, and uses Osgood’s (1952) semantic 
differential. 

Recent studies of perception (Hayek, 1952) 
and thinking (Bruner, Goodnow, & Austin, 
1956) have emphasized the importance of 
categorization. If categorization is central to 
these processes it should also be important in 
interpersonal communication. That is, if two 
people categorize events, objects and concepts 
in similar ways they should be able to com- 
municate more effectively. 

The work of Osgood and his associates
(Osgood, Suci, & Tannenbaum, 1957) stresses
the importance of the “semantic space” in
phenomena related to attitudes and communication.
It seems a reasonable hypothesis that
if two people have similar “semantic spaces”
they should be able to communicate more
effectively.


1 This paper is based on portions of the writer’s
doctoral dissertation. The author gratefully
acknowledges the guidance and help of W. W. Lambert,
T. A. Ryan, and W. F. Whyte. The larger
study, of which this is a part, was supported by a
grant from the Foundation for Research on Human
Behavior.

2 Now at the University of Illinois.

Cognitive similarity is related to additional 
variables. Newcomb (1953, 1956, 1958) sug- 
gests the following model: If A and B are 
cognitively similar and there is an opportunity 
for communication (propinquity), the com- 
munication will be more effective, the rela- 
tionship between A and B will be more re- 
warding, and A and B will therefore like each 
other more than if A and B are not cogni- 
tively similar. Cognitive similarity implies a 
similar orientation towards X, in Newcomb’s 
A-B-X model. Increased liking leads to higher 
rates of interaction between A and B and this, 
in turn, permits greater cognitive similarity 
thus starting the cycle all over again. 

This paper relates categoric similarity and
syndetic similarity to perceived communication
effectiveness and liking of the supervisor
by the subordinate. The hypotheses that are
tested may be stated as follows: (a) The
higher the communication effectiveness between
supervisor and subordinate, the more
the liking of the subordinate for the supervisor.
(b) The higher the categoric similarity
between the supervisor and subordinate,
the greater the communication effectiveness
and the more the liking of the subordinate
for the supervisor. (c) The higher the syndetic
similarity between the subordinate and
the supervisor, the greater the communication
effectiveness and the liking of one for the
other.


Method 


The study was conducted in an industry employing 
300 people. Approximately one half of the em- 
ployees participated in the study. Details on the 
company and the Ss can be found in Triandis (1958 
or 1959b). 

Table 1

Intercorrelations between the Main Variables

Surviving entries: .14, .04, .35, .49**, .41*

* p < .10.
** p < .05.


Procedure. (a) Categoric similarity: Twelve triads
of jobs and 12 triads of people were presented to the
Ss (see Triandis 1958 or 1959b for exact jobs and
people). The Ss were asked: “Which one of these
three jobs (people) is more different from the other
two?” “Why?” and “What is the logical opposite
of the characteristic that makes it different?” Thus,
we obtained lists of characteristics of jobs and people
and their logical opposites. These lists were then
subjected to a content analysis and rated as to their
similarity by the two judges. The corrected interjudge
reliability was .92 for people and .87 for jobs.
The instructions to the raters and the rating scale
can be found in Triandis (1958 or 1959a). (b) Syndetic
similarity: A semantic differential was constructed
for jobs and another one for people. Most
of the scales of these differentials were relevant to the
concepts that were to be rated on them. Twenty-eight
of the scales of the differentials were obtained
from a stratified random sample of the lists of
characteristics obtained from the categoric similarity
procedure described under (a) above. Ten additional
scales were selected so as to represent the seven
factors of Osgood et al. (1957, pp. 62-64). Eleven
concepts were judged against these scales. The
job-concepts were: a welder’s job, a teacher’s job, a
personnel director’s job, a vice president’s job, and a
clerk’s job. The sequence of these jobs was
counterbalanced for every group of Ss. The
people-concepts used were: Dick T. (the personnel
director of the company), your supervisor, the boss of
your supervisor, the vice president of your division,
a fellow at work whom you like, and an effective
manager you have known well and who is not the same
as any of the men already rated. The instructions as
well as the exact semantic differentials used may be
found in Triandis (1958, pp. 296-298). The test-retest
reliability of the differentials for 20 workers was .83
and .92. The syndetic similarity was computed from


$$O = 1 - \frac{\sum d^2}{36n} = 1 - \frac{D^2}{36}$$

where n is the number of scales over which the
difference d between the ratings of the two Ss is being
summed. The constant 36 = 6² is due to our use of
a seven-point scale. Five jobs and three people were
used in the computation of the syndetic similarity
coefficients. (c) Communication effectiveness and
liking scales. Two scales were constructed, one for
each variable. The Thurstone method of successive
intervals (Edwards, 1957) was used. The items and
scale values can be found in Triandis (1958, pp. 110-
112). The parallel form reliability of these attitude
scales, using 45 college student Ss, was .88. The
scales were subjected to a scalogram analysis and
yielded Guttman coefficients of reproducibility of .85
and .88, respectively. The two scales were highly
intercorrelated. For 31 female clerks r = .76; for 31
male clerks r = .84; for 42 managers r = .83; and for
51 workers r = .92.
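The syndetic similarity coefficient described in part (b) can be sketched in a few lines. This is an illustrative reading of the formula, not the author's own computing procedure: the function name and the example ratings are hypothetical, and ratings are assumed to run from 1 to 7 so that the largest possible difference on a scale is 6.

```python
def syndetic_similarity(ratings_a, ratings_b, scale_points=7):
    """Similarity of two Ss' semantic-differential ratings.

    Implements O = 1 - sum(d^2) / (36 n) for a seven-point scale,
    where d is the difference between the two Ss on one scale and
    n is the number of scales summed over.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both Ss need the same, non-zero number of ratings")
    max_squared_difference = (scale_points - 1) ** 2  # 36 for seven points
    sum_squared = sum((a - b) ** 2 for a, b in zip(ratings_a, ratings_b))
    return 1.0 - sum_squared / (max_squared_difference * len(ratings_a))


# Hypothetical ratings by a supervisor and a subordinate on four scales.
supervisor = [2, 5, 6, 3]
subordinate = [3, 5, 4, 3]
print(round(syndetic_similarity(supervisor, subordinate), 3))
```

The coefficient is 1 when the two profiles are identical and 0 when the two Ss are at opposite ends of every scale.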


Results 


Correlational analysis. Since we considered 
independent pairs of supervisors and subordi- 
nates we could only use 20 such pairs for this 
analysis. Table 1 shows the matrix of inter- 
correlations. 

Factor analysis of the matrix of intercor- 
relations. The matrix of intercorrelations 
which consists of the correlation coefficients 
of variables K_j, K_p, O_j, and O_p, as well as C_e
and L, was factored by means of Thurstone’s
(1947) centroid method. Three factors were 
extracted and rotated for simple structure. 
The unrotated and rotated matrices are shown 
in Tables 2 and 3. 

The first factor, which accounts for 31.3%
of the variance accounted for, is a categoric
similarity factor. The second factor accounts
for 13.7% of the variance and may be called
a syndetic similarity factor. The third factor
accounts for 55% of the variance and is saturated
with L, C_e, O_j, and O_p. It may be called
an evaluative factor.


Table 2

Unrotated Factor Matrix of Matrix of Intercorrelations between Our Main Variables

Variables: Categoric similarity—jobs (K_j); Categoric similarity—people (K_p);
Syndetic similarity—jobs (O_j); Syndetic similarity—people (O_p);
Communication effectiveness (C_e); Liking for supervisor (L)

Surviving entries: −.176, .320, .394, .391, .419, .445; .192, .143, .021, .045, .112


Table 3

Rotated Factor Matrix of Matrix of Intercorrelations between Our Main Variables

Surviving entries: .670, .204, .690, .096, .405, .276, .237, .143;
.458, .485, .009, .076, .056, .020; .491, .533, .537, .400, .486, .580

Percentage of total variance: 31.3, 13.7, 55; total 100.0


The regression equations. The means and
standard deviations of the six main variables
are presented in Table 4.

Using the standard methods for the determination
of regression equations (McNemar,
1949, Chap. 9), including Doolittle’s method,
we obtained the following equations, expressed
in standard form.

$$z'_{C_e} = .0001\,z_{K_j} + .289\,z_{K_p} + .373\,z_{O_j} + .153\,z_{O_p}$$

$$z'_{L} = -.168\,z_{K_j} + .428\,z_{K_p} + [\ldots]\,z_{O_j} + .221\,z_{O_p}$$

$$C_e = -1.6\,K_j + 9.4\,K_p + 38.5\,O_j + [\ldots] \qquad [5]$$

with an error of 2.2.

$$L = -.19\,K_j + 3.2\,K_p + 9.3\,O_j + 1.9\,O_p - 2.48 \qquad [6]$$

with an error of .45.

If we multiply both sides of Equation [6]
by 4.15, to equalize the coefficient of O_j, we
can compare [5] and [7] more conveniently.

$$4.15\,L = -.79\,K_j + 13.2\,K_p + 38.5\,O_j + 7.9\,O_p - 10.2 \qquad [7]$$

Thus, the communication effectiveness and
liking for supervisor scores can be predicted
from the knowledge of the categoric similarity
and syndetic similarity coefficients.
The multiple r for the liking for supervisor
scores is .61 (p < .003), and the one for the
communication effectiveness scores is .51 (p
< .02). The most effective predictor of either
communication effectiveness or liking for
supervisor is the syndetic similarity for jobs.
The second most effective predictor is the
categoric similarity about people. The other
two cognitive similarity coefficients are
ineffective.
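The step from Equation [6] to Equation [7] is a rescaling of every coefficient by 4.15. The sketch below reproduces it from the printed coefficients of [6]. The function names and the example similarity scores are hypothetical; small differences from the printed values of [7] are to be expected, because the printed coefficients of [6] are already rounded.

```python
# Coefficients of Equation [6] as printed (raw-score form).
EQ6_COEFFICIENTS = {"K_j": -0.19, "K_p": 3.2, "O_j": 9.3, "O_p": 1.9}
EQ6_CONSTANT = -2.48


def predicted_liking(k_j, k_p, o_j, o_p):
    """Predicted liking-for-supervisor score from Equation [6]."""
    scores = {"K_j": k_j, "K_p": k_p, "O_j": o_j, "O_p": o_p}
    return EQ6_CONSTANT + sum(EQ6_COEFFICIENTS[name] * value
                              for name, value in scores.items())


def rescale(coefficients, constant, factor):
    """Multiply both sides of a linear equation by the same factor."""
    scaled = {name: factor * value for name, value in coefficients.items()}
    return scaled, factor * constant


scaled_coefficients, scaled_constant = rescale(EQ6_COEFFICIENTS,
                                               EQ6_CONSTANT, 4.15)
for name, value in scaled_coefficients.items():
    print(name, round(value, 2))
print("constant", round(scaled_constant, 2))

# Hypothetical similarity scores for one supervisor-subordinate pair.
print(round(predicted_liking(k_j=0.1, k_p=0.1, o_j=0.9, o_p=0.9), 2))
```

The rescaled O_j coefficient comes out near 38.6, close to the 38.5 of Equation [5], which is the purpose of the multiplication.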

Table 4

Means and Standard Deviations of Variables

Variables: Categoric similarity—jobs (K_j); Categoric similarity—people (K_p);
Syndetic similarity—jobs (O_j); Syndetic similarity—people (O_p);
Communication effectiveness (C_e); Liking for supervisor (L)

Surviving entries: .128, .097, .920, .926; 38 (Communication effectiveness)


The analysis of variance. The correlation
procedures that gave the results reported
above have one great deficiency; they waste
data. Each correlation is based on only a
few supervisor-subordinate pairs because of
the requirements for independence. Since results
based on small samples are less convincing,
and significant relationships are not easily
obtained with such samples, it is desirable to
use other statistical procedures. Analysis of
variance is the appropriate technique. If each
supervisor is considered a different “treatment,”
then it is possible to use many more
Ss in our computations. We have, then, two
classifications of the data; one according to
cognitive similarity, the other according to
supervisor. Since there is a variable number
of subordinates reporting to each supervisor,
however, we have unequal subclass n’s. Also,
in some cases we have had missing cells (all
subordinates of a given supervisor were either
very similar, or very dissimilar). We avoided
a large number of missing cells by excluding
from our analyses supervisors who had only
one subordinate for whom we had complete
data. Even with this restriction, however,
we had a number of missing cells; in other
words, we did not have the standard type of
analysis of variance. Analyses of variance
with missing cells and unequal n’s are described
in Snedecor (1956, pp. 382-385).
About twenty such analyses were undertaken.
The most interesting will be discussed below.

A triple classification analysis of variance is
particularly suitable for our data (effect of
categoric similarity, syndetic similarity, and
supervisor). Such analyses were not available
for unequal n’s and missing subclasses
when the analyses were first undertaken.
Professor C. R. Henderson, of Cornell’s Department
of Animal Husbandry, a mathematical
geneticist, solved the problem after a request
from this writer.


Table 5

The Results of the Analyses of Variance—Summary

Columns: Independent Variables; Dependent Variables; N used in Analysis;
Level of Significance (One-tail); Percentage of Total Variance

Double Classification Analyses. Surviving significance levels:
p < .02, p < .01, N.S., N.S., p < .10, N.S., p < .001, p < .01,
N.S., N.S., p < .025, p < .03. Surviving N’s: 70, 70, 60, 60, 60, 60.

Triple Classification Analyses. Surviving significance levels:
p < .125 (for O), .200 (for K); p < .30; p < .001 (for O), N.S. (for K).

Note.—S = supervisor; K = categoric similarity; O = syndetic similarity; p = people; j = jobs; j + p = average of j + p
scores; C_e = communication effectiveness; L = liking for supervisor; and N.S. = nonsignificant.


Table 5 presents a summary of all the
analyses of variance. The results of these
analyses are as follows:

1. Categoric similarity based on people is 
significantly related to both communication 
effectiveness and liking for supervisor. It 
takes care of 6.0 and 6.6% respectively of the 
variance of scores. 

2. Categoric similarity based on jobs is not 
significantly related to either communication 
effectiveness or liking for supervisor. 

3. If we average the categoric similarity 
scores we can predict communication effec- 
tiveness, accounting for 5.7% of the variance, 
but not liking for supervisor. 

4. Syndetic similarity about jobs is highly 
related to both communication effectiveness 
and liking and accounts for 6.6 and 4.9% of 
the variance. 

5. The results of the triple classification in- 
dicate that syndetic similarity is a much more 
important variable than categoric similarity. 





Table 6

Analysis of Variance of Communication Effectiveness
Scores Classified According to Supervisor and
Levels of Syndetic Similarity About Jobs

(Management Group Only)

Source                    SS        df   Variance   F        Percentage of Variance
Supervisor                646.45    17   48.0       4.45**   60
O_j                       91.44     1    91.0       8.40*    9
S × O_j                   70.35     11   6.4                 7
Individual Differences                   10.8
Total                     1070.00        156.2

* p < .01.
** p < .001.
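The F and percentage columns of Table 6 can be checked against its other columns. The sketch below assumes, as is usual but not stated in the text, that each F is the ratio of an effect's variance to the individual-differences variance, and that each percentage is the effect's sum of squares divided by the total sum of squares. The names are hypothetical.

```python
# Surviving Table 6 entries (management group only).
TABLE_6 = {
    "Supervisor": {"ss": 646.45, "variance": 48.0},
    "O_j": {"ss": 91.44, "variance": 91.0},
    "S x O_j": {"ss": 70.35, "variance": 6.4},
}
INDIVIDUAL_DIFFERENCES_VARIANCE = 10.8
TOTAL_SS = 1070.00


def f_ratio(effect_variance, error_variance):
    """Assumed F: effect variance over individual-differences variance."""
    return effect_variance / error_variance


def percentage_of_variance(effect_ss, total_ss):
    """Assumed percentage: effect sum of squares over total sum of squares."""
    return 100.0 * effect_ss / total_ss


for source, row in TABLE_6.items():
    f_value = f_ratio(row["variance"], INDIVIDUAL_DIFFERENCES_VARIANCE)
    share = percentage_of_variance(row["ss"], TOTAL_SS)
    print(f"{source}: F = {f_value:.2f}, {share:.0f}% of variance")
```

Under these assumptions the supervisor and O_j ratios come out near 4.44 and 8.43, close to the printed 4.45 and 8.40, and the percentages round to 60, 9, and 7.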


In addition to these analyses, in the case of 
syndetic similarity on jobs (O;) we have 
enough cases to make separate analyses for 
workers, clerks, and managers. 

Table 6 presents the analysis of variance 
results for the management group; Table 7 
for the clerks, Table 8 for the workers, and 
Table 9 for all groups combined. 


Discussion 


Examination of Tables 6, 7, 8, and 9 shows
that both differences in level of syndetic similarity
about jobs and differences in supervisor
determine portions of the variance of communication
effectiveness scores. This phenomenon
is most clear with the management
group, less clear with the clerks and least
clear with the workers. One is tempted to
generalize that the extent to which the job
held by the Ss is “intellectual” determines
the influence of syndetic similarity about jobs
on the communication scores. It may be that
when an S has a manual job, his perception of
that job and other jobs is not very important
in terms of communication with his supervisor.
Very often the worker takes a job that
pays X dollars per hour and is not very concerned
with the nature of the job. The supervisor
tells him what the job is and he does
it. With professional jobs, however, such as
with engineers or managers, differences in the
perception of jobs between supervisor and
subordinate appear to be crucial.


Table 7

Analysis of Variance of Communication Effectiveness
Scores Classified According to Supervisor and
Levels of Syndetic Similarity About Jobs

(Clerks Only)

Source                    SS        Variance   F      Percentage of Variance
S                         58.55     7.31              30
O_j                       29.61     29.61      1.43   15
S × O_j                   33.52     8.38              18
Individual Differences              19.98             37
Total                     192.59    65.28


Table 8

Analysis of Variance of Communication Effectiveness
Scores Classified According to Supervisor and
Levels of Syndetic Similarity About Jobs

(Workers)

Source                    SS       df   Variance   F       Percentage of Variance
S                         45.72    3    15.24      2.57*   14
O_j                       10.91    1    10.91      1.84    3
S × O_j                   25.26    3    8.42               8
Individual Differences             41   5.92

* p < .05.


Table 9

Analysis of Variance of Communication Effectiveness
Scores Classified According to Supervisor and
Levels of Syndetic Similarity About Jobs

(All Groups)

Source                    SS        df    Variance
S                         868.69    17    51.44
O_j                       110.65    1     110.65
S × O_j                   150.44    7     21.49
Individual Differences              112   6.81

* p < .01.
** p < .001.


Explanation of the relative effectiveness of 
the four indices of cognitive similarity. Syn- 
detic similarity based on jobs and categoric 
similarity based on people were the only in- 
dices that were related to communication ef- 
fectiveness and liking. This requires an ex- 
planation. 

C. E. Osgood, in a private communication, 
suggested that the difference in the effective- 
ness of the two syndetic similarity indices is 
due to differences in the representativeness of 
the concepts rated. He argued that the jobs 
used in the computation of the syndetic simi- 
larity coefficients for jobs were more diverse 
and representative. They were welder, teacher, 
vice president, clerk, and personnel director. 
The people used in the syndetic similarity 
coefficients for people, on the other hand, 
were more homogeneous. They were the per- 
sonnel director, a supervisor, and the vice 
president of the employee’s division—all “su- 
pervisory.” This seems a reasonable explana- 
tion. It suggests further research to establish 
whether in fact one would get an even higher 
correlation between liking and syndetic simi- 
larity when extremely diverse concepts are 
rated by two Ss. One might conceivably ex- 
tend this explanation to explain also the 
greater effectiveness of categoric similarity 
based on people as compared to the cate- 
goric similarity based on jobs. 

There is, then, some evidence that certain
kinds of cognitive similarity are related to 
communication effectiveness and liking be- 
tween two Ss. Whether this is a specific or 
a general phenomenon is subject for further 
research. A laboratory test of the hypothesis 
tested in the present paper (Triandis, 1959a) 
suggests that it is a sufficiently stable phe- 
nomenon to deserve further study. 


Summary 


One hundred and fifty-five Ss responded to 
12 triads of jobs and 12 triads of people. The 
Ss were asked to state “Which job (person) 
is more different from the other two?” and
“Why?” The responses of subordinates and
supervisors to these triads were compared by 
two judges. If the responses were judged to 
be similar the index of categoric similarity of 
the pair was high. The same Ss were asked 
to rate five jobs and six people on specially
constructed semantic differentials. Similarity
of the “semantic profiles” obtained indicated 
high syndetic similarity between a boss and 
a subordinate. Successive interval scales on 
perceived communication effectiveness and 
liking within the boss-subordinate pair were 
also constructed. Correlational analyses and 
analyses of variance showed an association
between categoric similarity based on people 
and syndetic similarity based on jobs and 
communication effectiveness and liking within 
the pair. This is considered evidence sup- 
porting the hypothesis that cognitive similar- 
ity is a significant variable in interpersonal 
communication and liking. 


Received January 22, 1959.


REFERENCES


Bruner, J. S., Goodnow, J. J., & Austin, G. A. A
study of thinking. New York: Wiley, 1956.

Edwards, A. L. Techniques of attitude scale
construction. New York: Appleton-Century-Crofts,
1957.

Hayek, F. A. The sensory order: An inquiry into
the foundations of theoretical psychology.
Chicago: Univer. of Chicago Press, 1952.

Jenkins, W. L. A quick graphic method for the
product moment “r.” Educ. Psychol. Measur.,
1945, 5, 437-443.

McNemar, Q. Psychological statistics. New York:
Wiley, 1949.

Newcomb, T. M. An approach to the study of
communicative acts. Psychol. Rev., 1953, 60,
393-404.

Newcomb, T. M. The prediction of interpersonal
attraction. Amer. Psychologist, 1956, 11, 575-586.

Newcomb, T. M. The cognition of persons as
cognizers. In R. Tagiuri & L. Petrullo (Eds.),
Person perception and interpersonal behavior.
Stanford: Stanford Univer. Press, 1958.

Osgood, C. E. The nature and measurement of
meaning. Psychol. Bull., 1952, 49, 197-237.

Osgood, C. E., Suci, G. J., & Tannenbaum, P. H.
The measurement of meaning. Urbana: Univer.
of Illinois Press, 1957.

Snedecor, G. W. Statistical methods. Ames, Iowa:
Iowa State Coll. Press, 1956.

Thurstone, L. L. Multifactor analysis. Chicago:
Univer. of Chicago Press, 1947.

Triandis, H. C. Some cognitive factors affecting
communication. Unpublished doctoral dissertation,
Cornell Univer., 1958.

Triandis, H. C. Categoric similarity and the
communication of the dyad. Sociometry (in press),
1959. (a)

Triandis, H. C. Categories of thought of managers,
clerks and workers about jobs and people in
industry. J. appl. Psychol., 1959, 43, 338-344. (b)





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


A FEMININITY ADJECTIVE CHECK LIST ¹


RALPH F. BERDIE

Student Counseling Bureau, University of Minnesota

¹ This study was supported with funds from the Graduate School and
the Office of the Dean of Students, University of Minnesota.


In spite of valiant attempts to construct theories to explain the
development of vocational interests (Bordin, 1943; Darley & Hagenah,
1955; Strong, 1943; Super, 1953), no available theoretical basis allows
us to predict the scores a child will obtain as an adult on the Strong
Vocational Interest Blank. The purpose of this paper is to suggest
briefly an approach to vocational interest theory construction, and
then in more detail to describe the development of an instrument
devised to aid in such theoretical exploration.

Scores on vocational interest blanks are closely and intimately
related to other measurable aspects of personality, and a theory of
vocational interest development must be regarded as a part of the
broader theory of personality development. The aspects of behavior
which are elicited through the use of vocational interest blanks are
called vocational interests because the items in the blank tend to
refer to vocations, and the development of the blank makes use of
groups of persons segregated upon the basis of occupation.
Furthermore, these blanks for the most part are used to help persons
make decisions concerning occupational choices. It is doubtful,
however, if a structured organization of personality dynamisms or
behaviors that can be called vocational interests exists within the
individual apart and discrete from other patterns of personality
organization. One might say that as we view personality through the
lens of a vocational interest blank, and as we relate what we see to
the present and future occupational behavior of the individual, we are
observing vocational interests, but it is the lens and our purpose of
observation rather than the personality structure that makes these
vocational interests.

If this definition of vocational interests is correct, then any theory
which aids in the explanation of interests must make use of concepts
that also have power to explain the development of other aspects of
personality. In other words, the concepts that best will explain the
development of vocational interests must also explain how this
development is a part of the total development of personality. They
must allow predictions to be made concerning vocational interests on
the basis of various types of personal behaviors, and in turn must
maximize the consistency between vocational interests and these other
behaviors.

A basis for constructing a theory of voca- 
tional interests might be found if two types 
of concepts were employed experimentally, a 
concept of dimension and a concept of proc- 
ess. We are accustomed to working with both 
types of concepts in psychology. Dimensions 
include such things as ability, sociability, 
rigidity, and masculinity-femininity. Proc- 
esses include such things as identification, re- 
pression, self-acceptance, and perception. We 
propose to test whether an analysis of voca- 
tional interests using a few carefully selected 
dimensions and processes will help explain 
some of the underlying source of variance of 
interests and personality. Dimensions we 
hope to eventually study include ability, 
masculinity-femininity, sociability, rigidity, 
and socioeconomic status. The processes to 
be considered include identification with other 
persons, self-acceptance, and perceptual dis- 
crepancies, the latter including discrepancies 
between self, others, and occupational stereo- 
types. 

The dimension selected first for study was that of
masculinity-femininity. Much is known about this dimension and, more
importantly, it has been used frequently in theoretical discussions of
both vocational interests and personality. An individual’s behavior
and his occupation are influenced by a variety of roles, roles defined
by his race, his religion, his family, and his peers, but perhaps no
role plays so important a part as the role defined by the person’s
sex. Much of what a person does, even those things not directly
related to sexual behavior, is influenced by his perception of how
other persons of his sex behave and by what he considers appropriate
behavior for persons of his sex.

Counselors and clinical psychologists work 
with many persons who have failed to achieve 
a satisfactory definition of their sexual roles. 
Many men are reluctant to show affection, 
to accept their own emotional experiences, 
and to modify their own dominant and sub- 
missive behavior patterns appropriately for 
given situations. Many men and women face 
problems of vocational choice related to con- 
fusion of sexual role. 

Several methods are available for measur- 
ing psychological masculinity-femininity. Per- 
haps the most complete and comprehensive 
report of a measuring instrument is contained 
in Terman and Miles’ book (1936). The 
Strong Vocational Interest Blank, the Minne- 
sota Multiphasic Personality Inventory, and 
the Rorschach all provide scores related to 
masculinity-femininity. Other methods, both 
projective and inventory, have been developed 
to assess masculinity. All methods, however, 
which have been demonstrated to have both 
reliability and validity are rather cumbersome 
for use in large scale research designs, and the 
first problem faced was to develop a means
for assessing masculinity-femininity which
was fast, easy to administer, and efficient, and
which possessed the necessary reliability and
validity for research purposes.


The Development of an Adjective Check List 


The adjective check list has some advantages over 
certain other methods. It is easily administered and 
scored, it has been demonstrated to have some 
validity, and perhaps most importantly, Ss are not 
reluctant to perform this kind of task. Therefore, 
an adjective check list was prepared consisting, for 
the most part, of adjectives found in previous 
studies related to masculinity-femininity. The Gough 
Adjective Checklist was used as a starting point 
(Gough, 1955). This list contains 300 adjectives, 
many of which have been found at the University of 
California to be related to masculinity-femininity 
scores on the Strong Vocational Interest Blank, the 
Minnesota Multiphasic Personality Inventory, and 
the California Psychological Inventory. Review of 
other studies of masculinity-femininity resulted in 
the addition of a few items to the list. The check list,
as finally used, consisted of 148 adjectives arranged
in four columns. It is shown in Fig. 1.




The standardization sample consisted of a group 
of 600 students asked to complete the check list in 
the summer of 1955, prior to their matriculation as 
freshmen at the University of Minnesota. This group 
of students was given the check list five times with 
five different instructions. First, they were instructed 
to check each adjective thought to apply to them- 
selves. Next, using another copy of the same check 
list, they were instructed to check those adjectives 
which described the kind of persons they would like 
to be. Then they checked adjectives thought most 
descriptive of the average person of their age and 
sex. Next were checked adjectives considered most
descriptive of father and finally adjectives considered
most descriptive of mother. Thus, five adjective
check lists were available for each S, a list for
self, for ideal, for average, for father, and for mother.

The standardization sample included 200 women 
freshmen in the College of Science, Literature, and 
the Arts, 200 male freshmen from that college, and 
200 male freshmen from the Institute of Tech- 
nology. These Ss averaged 18 years of age, almost 
all of them were graduates of Minnesota high 
schools, and almost all of them were relatively 
selected students coming from the upper one-half of 
their high school classes. About one-half came from 
metropolitan areas, the remainder from small cities, 
towns, and farms. 

The groups of 200 SLA men and of 200 IT men
were each divided at random into two groups with
100 in each group. The first 100 SLA men were
known as the standardization SLA group, the
first 100 IT men were known as the standardization
IT men, and the two remaining male groups were
known as the nonstandardization groups.

In order to develop the scale, the 100 SLA and 
100 IT men in the standardization groups were 
combined and compared to the 200 SLA women. 
Item response frequencies on the self-description 
were determined for the two groups on the 148 
items. When the significance of the differences was
analyzed, 15 items were found to be checked more
frequently by the men than by the women at the
.05 level of significance, and 46 items were checked
significantly more often by the women. Thus, 61 of
the 148 items significantly differentiated between the
two criterion groups. The significance of differences
was checked using critical ratios, the Lawshe-Baker
Nomograph, and finally chi-square tests.
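
As a rough illustration of the item analysis just described, the following sketch computes a critical ratio for the difference between two independent proportions. The pooled-variance form of the ratio and the function name are assumptions of the sketch; the article does not say which variant was used. The percentages for "aggressive" (44% of men, 30% of women) are the ones quoted later in the article, and the group size of 200 each is assumed for the example.

```python
import math

def critical_ratio(p1, n1, p2, n2):
    """Critical ratio (z) for the difference between two independent proportions.

    Uses a pooled estimate of the common proportion; this variant is an
    assumption, since the article does not give its formula.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical use: "aggressive" checked by 44% of 200 men and 30% of 200 women.
z = critical_ratio(0.44, 200, 0.30, 200)
significant_at_05 = abs(z) > 1.96  # two-tailed .05 criterion
```

Under these assumed group sizes the item would pass the .05 criterion, which is consistent with its appearance among the negatively weighted items.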

The scoring scale finally adopted included 46 items 
which were given positive weights of one and 15 
items given negative weights of one. Given positive 
weights were Items 2, 8, 9, 18, 19, 26, 36, 38, 42, 
43, 44, 47, 48, 53, 54, 59, 62, 72, 76, 82, 83, 85, 
87, 89, 99, 102, 108, 109, 110, 111, 112, 115, 117, 
118, 119, 120, 125, 127, 129, 133, 133, 137, 138, 140, 
146, and 148. Given negative weights were 3, 11, 15, 
17, 33, 50, 69, 77, 80, 88, 104, 114, 126, 136, and 
145. The scale thus was really a femininity scale 
rather than a masculinity scale insofar as most of 
the items which determined the score were items 
which were marked more characteristically by women 
than by men. The possible range of scores was from
46, the most feminine score, to minus 15, the most
masculine score, or a total range of 61.


  1 active           38 dependent          75 luxury-loving    112 shallow
  2 affectionate     39 determined         76 mannerly         113 sharp-witted
  3 aggressive       40 distrustful        77 masculine        114 shrewd
  4 alert            41 dominant           78 mature           115 shy
  5 ambitious        42 dreamy             79 methodical       116 simple
  6 anxious          43 effeminate         80 mild             117 sincere
  7 argumentative    44 emotional          81 moderate         118 slow
  8 appreciative     45 enterprising       82 modest           119 soft-hearted
  9 artistic         46 fair-minded        83 nervous          120 spontaneous
 10 assertive        47 feminine           84 noisy            121 steady
 11 athletic         48 flirtatious        85 obnoxious        122 stolid
 12 autocratic       49 forceful           86 organized        123 straightforward
 13 boisterous       50 foresighted        87 outgoing         124 strong
 14 bold             51 fussy              88 out-of-doors     125 submissive
 15 calm             52 gentle             89 outspoken        126 suspicious
 16 capable          53 graceful           90 painstaking      127 sympathetic
 17 cautious         54 gracious           91 patient          128 tactful
 18 charming         55 greedy             92 peaceable        129 temperamental
 19 cheerful         56 hasty              93 persevering      130 tense
 20 civilized        57 helpful            94 planful          131 thorough
 21 clearthinking    58 hostile            95 precise          132 thoughtful
 22 clever           59 humorous           96 progressive      133 thoughtless
 23 coarse           60 imaginative        97 rational         134 timid
 24 cold             61 impatient          98 reckless         135 tolerant
 25 commonplace      62 impulsive          99 refined          136 tough
 26 complicated      63 independent       100 resentful        137 trusting
 27 confident        64 industrious       101 reserved         138 unaffected
 28 conscientious    65 initiative        102 restless         139 unambitious
 29 conservative     66 insightful        103 robust           140 understanding
 30 considerate      67 intelligent       104 rough            141 unemotional
 31 contented        68 interests narrow  105 rude             142 unkind
 32 conventional     69 interests wide    106 self-centered    143 versatile
 33 cool             70 intolerant        107 self-controlled  144 vigorous
 34 courageous       71 jolly             108 selfish          145 virile
 35 cruel            72 kind              109 sensitive        146 warm
 36 curious          73 leisurely         110 sentimental      147 weak
 37 demanding        74 logical           111 serious          148 worried

Fig. 1. Adjective check list from which masculinity-femininity score is derived.

After the development of the scale, all the adjective 
checklists for all of the groups were scored. 
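
The scoring rule described above (plus one for each checked feminine item, minus one for each checked masculine item) can be sketched as follows. The function and constant names are hypothetical. The item numbers are copied from the lists printed in the article; because the printed list of positively weighted items repeats Item 133, only 45 distinct positive items can be entered here, and the forty-sixth is left unfilled rather than guessed.

```python
# Item numbers as printed in the article (Item 133 appears twice in the
# printed positive list, so this set holds 45 distinct numbers, not 46).
POSITIVE_ITEMS = {
    2, 8, 9, 18, 19, 26, 36, 38, 42, 43, 44, 47, 48, 53, 54, 59, 62, 72,
    76, 82, 83, 85, 87, 89, 99, 102, 108, 109, 110, 111, 112, 115, 117,
    118, 119, 120, 125, 127, 129, 133, 137, 138, 140, 146, 148,
}
NEGATIVE_ITEMS = {3, 11, 15, 17, 33, 50, 69, 77, 80, 88, 104, 114, 126, 136, 145}

def femininity_score(checked):
    """Score one check list: +1 per checked positive item, -1 per checked negative item.

    `checked` is any iterable of the item numbers (1-148) a respondent marked.
    Unweighted items contribute nothing.
    """
    checked = set(checked)
    return len(checked & POSITIVE_ITEMS) - len(checked & NEGATIVE_ITEMS)

# Hypothetical respondent who checked "affectionate" (2), "appreciative" (8),
# "aggressive" (3), and the unweighted item "active" (1).
example_score = femininity_score([1, 2, 3, 8])
```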


Scores on the Scale 


Mean scores and standard deviations for 
the various groups on the self-descriptions 
are shown in Table 1. 

For the groups upon which the scale was 
developed, the mean score for the 200 SLA 
women was 16.7 as compared to the means 
for the male groups of 7.9 for SLA men and 
7.7 for IT men. Thus, the means for the 
men and women are more than a standard 
deviation apart. The mean of the 100 non- 
standardization SLA men was 10.7 and for the 
100 nonstandardization IT men, 8.3, thus 
showing some shrinkage when the scale was 
applied to groups upon which it had not been 
developed. A group tested in 1957, to be 
described later, provides further information 
about such shrinkage for both sexes. 

The scores of the women ranged from U 
to 31, the scores of the men from minus 6 to 
28. The median women’s score was 17 and 
of the 400 men, only 22, or 5%, obtained 
scores above 17. The median score for the 
men was 8, and only 7% of the women re- 
ceived scores below 8. 

Analysis of variance was used to determine the significance of
differences between various groups. Among the groups of men, three
differences were found to be statistically significant at the .001
level. The nonstandardization SLA group received more feminine scores
than did the SLA standardization group, and the nonstandardization SLA
group obtained more feminine scores than both the standardization and
nonstandardization IT groups. Variances were homogeneous.


Table 1
Means and Standard Deviations of Self-Description
Femininity Score for Groups Tested

Group
1955 SLA standardization men
1955 IT standardization men
1955 SLA nonstandardization men
1955 IT nonstandardization men
1955 SLA women
1957 SLA men
1957 IT men
1957 SLA women
1957 Homosexual men

[The Mean and SD columns are not legible in the source scan.]


Reliability of the Femininity Scale


The self-description adjective check lists for the 200
nonstandardization IT and SLA men were scored to provide split-half
scores. One half consisted of the weighted items checked in Columns 1
and 3 and the other was based on items checked in Columns 2 and 4. The
correlation between scores for these two halves for the 200 men was
.45. The same 200 adjective check lists were scored for odd and even
items, and this provided a correlation of .49. In December 1955, 95 men
entering the College of Science, Literature, and the Arts completed the
adjective check list and one day later completed the blank again. The
test-retest correlation was .81.

The two low correlations suggest more about the nature of the adjective
check list than a lack of reliability. The items that differentiate men
and women may be consistent or inconsistent, but low internal
reliability, as compared to the higher test-retest reliability,
suggests that many of these items are not related one to another. A low
odd-even reliability well might be obtained if one used instead of
adjectives such things as length of hair on the head, tendency toward
baldness, height, and ratio of weight and chest measurements. These are
all indices that discriminate reliably between the sexes but which tend
to have very little correlation among themselves within a given sex. In
the adjective check list, 30% of men and 60% of women checked the
adjective “cool,” and 44% of men and 30% of women checked the adjective
“aggressive.” Both of these adjectives discriminate between sexes, but
either among men or among women the correlation between these two items
might well be zero. A relatively low odd-even correlation suggests that
many kinds of behaviors discriminate between men and women and that
these behaviors are not all related.
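
The odd-even procedure described in this section can be sketched as below: each respondent's weighted items are split by item number, and the two half scores are then correlated across respondents. The helper names and the toy data are hypothetical; the weighting follows the plus-one, minus-one rule given earlier.

```python
import math

def half_scores(checked, positive, negative):
    """Return (odd_score, even_score) for one respondent's checked item numbers."""
    odd = even = 0
    for item in checked:
        weight = 1 if item in positive else -1 if item in negative else 0
        if item % 2:
            odd += weight
        else:
            even += weight
    return odd, even

def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Toy weights and three made-up respondents, for illustration only.
toy_positive, toy_negative = {2, 8, 9}, {3}
respondents = [{2, 3, 8, 9}, {2, 9}, {3, 8}]
pairs = [half_scores(c, toy_positive, toy_negative) for c in respondents]
odd_even_r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
```

A low value of such a correlation, alongside a high test-retest correlation, is the pattern the text interprets as evidence that the discriminating items are largely unrelated to one another.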


Relationships between Femininity Indices 


For most Ss, additional data were available, and the relationships
between adjective checklist scores and other data were observed. The
correlations are presented in Table 2, the upper right segment of
figures pertaining to the nonstandardization SLA men, the lower left
group of figures to the nonstandardization IT men. In this table, a
correlation of .28 is statistically significant at the .01 level, .22
at the .05 level.


Table 2
Correlations between Femininity Indices for Groups of SLA and IT
Nonstandardization Men

Variables: 1. MMPI; 2. SVIB; 3. Self; 4. Ideal; 5. Average; 6. Father;
7. Mother

[The body of the correlation matrix is not legible in the source scan.]

Note. Correlations in upper right segment of table for 81-100 SLA men,
in lower left segment for 76-100 IT men.

For the SLA men the self-description cor- 
relates significantly with Mf score on the 
Minnesota Multiphasic Personality Inventory 
and with the masculinity score on the Strong 
Vocational Interest Blank. Only the latter 
correlation is significant for the IT men. The 
one correlation that shows a large difference 
between the two groups is that between the 
score based on self-description and the de- 
scription of father, where the correlation for 
the IT men is .71, for the SLA men, .21. In 
spite of the great statistical significance of 
this difference, a repeat of this analysis two 
years later provided correlations with no sig- 
nificant difference. These correlations did 
suggest, however, that the perception of father 
in terms of masculinity-femininity is some- 
what different for engineering freshmen than 
for Arts College freshmen.

The scores for self and for ideal are rather
highly correlated for both groups, with the
self-average correlation being somewhat lower.
These people describe themselves as being 
more similar to their ideal than they do to 
their perception of the average person of their 
age and sex. 

The two correlational matrices for the seven variables were compared
by transforming the r’s to z, where
$z = \tfrac{1}{2} \log_e \frac{1+r}{1-r}$, and comparing the
differences between z’s. The observed t was not significantly
different from zero, p = .56, thus supporting the assumption that
the two sets of correlations were similar. The
matrix for IT men then was factor analyzed, 
using Thurstone’s (1947) centroid method. 
The factor analysis was iterated once in order 
to stabilize the communality estimate. Three 
factors were extracted. These three factors 
were rotated obliquely and the loadings on 
the rotated reference vectors were: 


                 Reference vector
Variable        A      B      C

   1           02     55     21
   2           [04, 01; one entry illegible]
   3           69     18     31
   4           60      1     04
   5           68     09     13
   6           89     05     02
   7           62     01     01


Correlations between the primary vectors were:

A & B   .28        A & C   .07        B & C   -.15


The test indices thus form one cluster, the 
adjective indices another. A factor analysis 
of the five adjective indices alone provided
two factors which correlated .60. Self and 
ideal tended to cluster, the others did not. 


Application of the Scale to Another Group 


A group of entering freshmen similar to 
those studied in 1955 was given the check 
list in 1957. This was done in order to repli- 
cate certain portions of the study, to obtain 
information concerning additional variables, 
and to obtain a further idea of the shrinkage 
using groups other than the ones upon which
the scale was standardized. The means and
standard deviations for the three 1957 groups 
are also presented in Table 1. There it will 
be seen that the mean score for 1957 SLA 
women was about one point lower than the 
mean for the group upon which the scale had 
been standardized. Thus, relatively little 
shrinkage was found. The mean of the 1957 
SLA men was between the means of the two 
SLA groups in the earlier year, and the same 
was true for the 1957 IT group. These 
means, obtained for the various groups dur- 
ing the two years, and the figures pertaining 
to overlap indicate that this is a reasonably 
valid means for assessing the psychological 
differences between men and women of this 
age. The difference between 1957 SLA men 
and 1957 SLA women provided a critical 
ratio of 11.22, significant at the level of .001. 

This sample provided an opportunity for 
us to study further the relationships between 
“self” scores and other scores on the adjective 
check list and to give particular attention to 
the correlations which in the 1955 sample 
distinguished between the SLA and IT groups. 
Table 3 summarizes these correlations. 

The “self” and “ideal” correlations were 
relatively constant. However, the large dif- 
ference in correlations between self-descrip- 
tion and description of father for the two 
college groups disappeared. In the 1955 
samples, these two correlations were .21 and 
.71; in the 1957 samples, .45 and .42. Thus, 
what was a statistically significant difference 
upon replication was reversed in direction and 
no longer significant. 


Table 3

Correlations between Selected Femininity Indices for
Groups of SLA and IT Nonstandardization Men
in 1955 and SLA and IT Men in 1957

Variables Correlated      SLA 1955       SLA 1957

N                         100            85
Self and ideal            [illegible]    [illegible]
Self and father           [illegible]    .45**
Self and mother           .33**          .46**

[The IT columns are not legible in the source scan.]

* Significant between .05 and .01.
** Significant beyond .01.


The two correlations for the SLA groups
were different at a level significant at .07.
The difference between the two IT correlations
was significant beyond .01. Thus, the
difference between the two original correlations,
.21 and .71, changed because of what
appear to be significant changes in both
groups with the greatest change in the IT
group.
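
The comparison of two independent correlations reported here can be sketched with the r-to-z transformation the article mentions earlier. The standard error used below, based on 1/(N - 3) for each sample, is an assumption of the sketch, since the article does not print its formula. The function names are hypothetical; the correlations (.21 and .45) and sample sizes (100 and 85) for the SLA groups are taken from the text and from Table 3.

```python
import math

def fisher_z(r):
    """r-to-z transformation; atanh(r) equals 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def compare_correlations(r1, n1, r2, n2):
    """Normal-deviate test for the difference between two independent correlations.

    Returns the deviate and its two-tailed probability.
    """
    diff = fisher_z(r1) - fisher_z(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# SLA self-father correlations: .45 in 1957 (N = 85) versus .21 in 1955 (N = 100).
z, p = compare_correlations(0.45, 85, 0.21, 100)
```

Under these assumptions the two-tailed probability comes out near .07, in line with the level reported for the SLA groups.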


Scores of Homosexual Men 


Masculinity-femininity indices have been
validated not only on the basis of difference
between men and women but also by
contrasting homosexual men with others. The
cooperation of Evelyn Hooker (1957) made
possible the collection of data from a group
of 43 homosexual males in California. These
were men who were not institutionalized, who
were tending to make adequate occupational
adjustments, and who had participated as Ss
in other experiments.

The median age of this group was 33 years,
with a Q1 of 28 and a Q3 of 45 years. Of the
group, 88% were high school graduates, 67% 
had gone beyond high school, 33% were col- 
lege graduates, and 5% had gone beyond col- 
lege. Thus, the homosexual group differed 
from the other male groups for whom data 
were available on the basis of age, education, 
and geography, as well as on the basis of 
sexual behavior and preferences. The differ- 
ence in mean scores might result from any 
single difference or combination of these dif- 
ferences. 

The mean femininity score for the homosexual group, as shown in
Table 1, was 18.9, higher than the means for the two groups of college
women and approximately two standard deviations above the means for
the college men. Of the homosexual group, 14, or 33%, obtained scores
below 17, the median of the college women, and only 1, or 2%, obtained
scores below eight, the median for college men.

Others have suggested a relationship between age and psychological
femininity (Strong, 1943), but the available data suggest this
relationship is not strong enough for the difference obtained here to
be attributed to the greater age of the homosexual group.







The correlations between the self-descriptions and indices based on
the other adjective checklists were compared for the homosexual group
and the IT and SLA men. These correlations were:

              Self and    Self and       Self and    Self and
              Ideal       Average        Father      Mother

Homosexual    .47         .23            .31         .26
SLA           .53         .39            .21         .33
IT            .63         [illegible]    .71         .49

Both college groups described themselves as being more similar to
their ideal and to their perception of the average person of their age
and sex than did the homosexual group. No such consistent trends were
found for parental descriptions.


Summary and Conclusions 


An adjective check list scale was developed to provide an easily
obtainable index of psychological masculinity-femininity. The derived
scale was based on 61 items included in a list of 148 adjectives. Only
a minute or two is required to check the list by most Ss. The index
substantially distinguishes between groups of male and female college
freshmen, and between a group of homosexual men and male college
freshmen. The nonunitary character of the scale is revealed by low
intrascale correlations. The higher test-retest reliability and the
higher interscale correlations suggest that the index is reliable
enough for the kinds of group research for which it was developed. The
scale is not presented as an instrument to be used for purposes of
individual diagnosis.

Correlations between this index and mas- 
culinity scores on the Strong Blank and the 
MMPI are in the expected direction and are 
statistically significant, but the order of these 
correlations suggests the variables measured 
by the three instruments are not the same. It 
is hoped that the adjective check list will in- 
crease our understanding of the other two 
well established instruments and further our 
knowledge of the psychological meaning of 
the masculinity-femininity variable. 


Received December 17, 1958. 


References 


Bordin, E. S. A theory of vocational interests as 
dynamic phenomena. Educ. psychol. Measmt., 
1943, 3, 49-65. 

Darley, J. G., & Hagenah, T. Vocational interest 
measurement: Theory and practice. Minneapolis: 
Univer. Minnesota Press, 1955. 

Gough, H. G. Reference handbook for the Gough 
Adjective Checklist. Berkeley: Univer. California 
Inst. of Pers. Assessment Res., 1955. 

Hooker, E. The adjustment of the male overt
homosexual. J. proj. Tech., 1957, 21, 18-31.

Strong, E. K., Jr. Vocational interests of men and
women. Stanford Univer. Press, 1943.

Super, D. E. A theory of vocational development.
Amer. Psychologist, 1953, 8, 185-190.

Terman, L. M., & Miles, C. C. Sex and temperament.
New York: McGraw-Hill, 1936.

Thurstone, L. L. Multiple factor analysis.
Chicago: Univer. Chicago Press, 1947.



Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


PREDICTION OF CONSUMER PURCHASE AND THE
UTILITY OF MONEY ¹


LYLE V. JONES 


University of North Carolina 


The model for prediction of consumer choice
presented by Thurstone (1945; 1951) repre- 
sents a significant advance toward the applica- 
tion of subjective measurement theory. Using 
that model it becomes possible to predict, from 
results of psychological scaling analysis, aspects 
of actual behavior of a group of consumers. 
The model has obvious relevance to problems 
of marketing and prediction of voting, to name 
but two problem areas. 

While Thurstone presented prediction pro- 
cedures in a nonparametric form, they may be 
easily extended in terms of the parametric 
preference model provided by the method of 
successive intervals (Adams & Messick, 1958). 
Such an extension has been derived by Bock 
(1956). In this extended form, the model 
allows consideration not only of differential 
preference for a group of competing consumer 
goods, but also inclusion of indices of the popu- 
larity of the prices at which each of the con- 
sumer items is offered (Jones, 1956). 


The Model 
Prediction of Choice 


In accordance with the scaling model, assume that the hypothetical
preference score of the $a$th respondent for the $i$th item may be
written

$$X_{ia} = Y_i + e_{ia} \tag*{[1]}$$

where $Y_i$ is the population mean preference score for Item $i$, and
$e_{ia}$ is a random component attributable to individual differences
in preference, distributed normally with zero mean and variance of
$\sigma_i^2$ in the population of Ss. Assuming preference scores for
any pair of consumer items to be uncorrelated, the normalized
difference between an individual’s preference for a pair of items, $i$
and $j$, is

$$z_{ija} = \frac{(X_{ia} - Y_i) - (X_{ja} - Y_j)}{\sqrt{\sigma_i^2 + \sigma_j^2}} \tag*{[2]}$$

$z_{ija}$ has the unit normal distribution. Further, the joint
distribution of $z_{ija}$, $z_{ika}$ is bivariate normal with mean of
zero, variance of unity, and correlation coefficient of $\sigma_i^2$
(Bock, 1956). That is

$$f(z_{ija}, z_{ika}) = N(0, 0, 1, 1, \sigma_i^2) \tag*{[3]}$$

From the method of successive intervals may be obtained estimates of
$Y_i$, $Y_j$, $Y_k$, $\sigma_i^2$, $\sigma_j^2$, and $\sigma_k^2$ for
any three consumer items. Utilizing those estimates (distinguished by
a caret, ^) define

$$c_{ji} = \frac{\hat{Y}_j - \hat{Y}_i}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_j^2}} \tag*{[4]}$$

$$c_{ki} = \frac{\hat{Y}_k - \hat{Y}_i}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_k^2}} \tag*{[5]}$$

Then the probability that an individual would choose Object $i$ over
both competing Objects $j$ and $k$ is given by

$$P(i > j \cap k) = \int_{c_{ji}}^{\infty} \int_{c_{ki}}^{\infty} f(z_{ija}, z_{ika}) \, dz_{ija} \, dz_{ika} \tag*{[6]}$$

This integral may be evaluated with use of tables for determining the
volume of quadrants of the bivariate normal distribution function
(Pearson, 1931). In the general case of predicting choice of one from
$m$ competing objects, evaluation of the multiple integral is possible
by reduction methods (e.g., Plackett, 1954).

¹ This paper reports research sponsored, in part, by the Quartermaster
Food and Container Institute for the Armed Forces. The views or
conclusions contained in this report are those of the author. They are
not to be construed as necessarily reflecting the views or
indorsements of the Department of Defense.
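
For readers without bivariate normal tables, the choice probability can be approximated numerically. The sketch below does not evaluate the double integral directly. It uses an equivalent single integral that follows from the same assumptions (independent normal preference scores): conditional on the score for the chosen item, each competitor must fall below it. The function names, the trapezoid rule, and the example values are all assumptions of the sketch; `sds` holds standard deviations, that is, square roots of the variances in the model.

```python
import math

def _pdf(x):
    """Unit normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def _cdf(x):
    """Unit normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def choice_probability(means, sds, i, steps=2000, width=8.0):
    """P(item i is preferred to every other item), by the trapezoid rule.

    Integrates density_i(x) * prod_{j != i} P(score_j < x) over x.
    """
    lo = means[i] - width * sds[i]
    hi = means[i] + width * sds[i]
    h = (hi - lo) / steps
    total = 0.0
    for n in range(steps + 1):
        x = lo + n * h
        value = _pdf((x - means[i]) / sds[i]) / sds[i]
        for j in range(len(means)):
            if j != i:
                value *= _cdf((x - means[j]) / sds[j])
        total += value * (0.5 if n in (0, steps) else 1.0)
    return total * h

# Hypothetical scale values for three competing items.
example_means = [0.5, 0.0, -0.3]
example_sds = [1.0, 1.2, 0.8]
example_probs = [choice_probability(example_means, example_sds, i) for i in range(3)]
```

With equal means and equal variances each of three items is chosen with probability one third, which gives a quick check on any implementation.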


Prediction of Purchase 


Assume that the competing consumer items 
are differentially priced. Then the preference 
score of the ath individual for the ith object 


334 





Prediction of Consumer Purchase 335 


offered at Price p may be written 


Xipa = Vit Uy t Cia L7] 
where Y; is the population mean preference 
score for Item i, U, is the population mean 
utility of Price p, and ej, as before, is a random 
component distributed as N (0, o,). 


Analogously to Equation [2], define 


Zip,jqae 
_[Xipa— (Vit U 9) J—LX iva (Vj4+U,)] 
Vo?+o;? 


[8] 

Under the previous assumptions, and analo- 
gous to [3] 

I (Sip jaa) Zip, k ra) = N(0, 0, i. : g;") 


[9] 


To express the probability that an individual
would purchase Object i at Price p rather than
Object j at Price q or Object k at Price r, we
again evaluate an integral of the form of [6],

$$P(ip > jq \cap kr) = \int_{c_{jq,ip}}^{\infty} \int_{c_{kr,ip}}^{\infty} f(z_{ip,jq,a}, z_{ip,kr,a}) \, dz_{ip,jq,a} \, dz_{ip,kr,a} \qquad [10]$$


where the lower limits of integration are 
and 


However, in this case we lack empirical estimates
of all parameters. While the method of
successive intervals provides the estimates $\hat{Y}_i$
and $\hat{\sigma}_i^2$, the U values remain unknown.
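
One way to recover the form of the lower limits of integration in Equation [10] is sketched below. It is offered as a reconstruction from the definition in [8], not as the author's own expressions; the symbols $c_{jq,ip}$ and $c_{kr,ip}$ follow the subscripts on the integral signs, and the use of careted estimates in the last two lines is an assumption.

```latex
% Object i at Price p is preferred to Object j at Price q when X_{ipa} > X_{jqa}.
% Rearranging the definition of z_{ip,jq,a} in [8]:
\[
X_{ipa} > X_{jqa}
\iff
z_{ip,jq,a} > \frac{(Y_j + U_q) - (Y_i + U_p)}{\sqrt{\sigma_i^2 + \sigma_j^2}} .
\]
% With estimates in place of parameters, the limits would then take the form
\[
c_{jq,ip} = \frac{(\hat{Y}_j + U_q) - (\hat{Y}_i + U_p)}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_j^2}},
\qquad
c_{kr,ip} = \frac{(\hat{Y}_k + U_r) - (\hat{Y}_i + U_p)}{\sqrt{\hat{\sigma}_i^2 + \hat{\sigma}_k^2}} .
\]
```

Only differences between utilities enter these limits, which is consistent with the utilities being determined on a scale with an arbitrary zero.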

With three competing consumer objects and
empirically known proportions of purchase of
each, there are three equations of the form of
Equation [10], one yielding $P(ip > jq \cap kr)$, another
$P(jq > ip \cap kr)$, and the third (not independent),
$P(kr > jq \cap ip)$. Using these values, tables of the
bivariate normal distribution function allow
iterative solutions for the three estimates of
utility, $U_p$, $U_q$, and $U_r$. Finally, if data are
available for several sets of three competing
consumer objects, each set containing one object
at Price p, one at Price q, and one at Price
r, then several estimates may be found and
checked for consistency, for each utility of
price.
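
The iterative solution for the utilities can be sketched in code. The version below is a minimal illustration under stated assumptions, not the procedure used in the study: it replaces the table-based iteration with nested bisection, assumes independent normal preference scores, and fixes the utility of the first price at zero (one way of setting the arbitrary zero). All function names and all numerical values are hypothetical; the "observed" proportions are generated from known utilities so that the recovery can be checked.

```python
import math


def _pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def _cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def first_choice_probability(t, means, sds, n_grid=400):
    """P(item t has the highest momentary score), independent normals."""
    lo = min(m - 8.0 * s for m, s in zip(means, sds))
    hi = max(m + 8.0 * s for m, s in zip(means, sds))
    step = (hi - lo) / n_grid
    total = 0.0
    for g in range(n_grid):
        x = lo + (g + 0.5) * step
        term = _pdf(x, means[t], sds[t])
        for o in range(len(means)):
            if o != t:
                term *= _cdf(x, means[o], sds[o])
        total += term
    return total * step


def purchase_probability(t, scale_values, dispersions, utilities):
    """Choice probability when each item's mean is Y + U, as in [7]."""
    means = [y + u for y, u in zip(scale_values, utilities)]
    return first_choice_probability(t, means, dispersions)


def _bisect(increasing_func, target, lo=-6.0, hi=6.0, n_iter=30):
    """Solve increasing_func(x) = target on [lo, hi] by bisection."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if increasing_func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def solve_utilities(scale_values, dispersions, observed):
    """Estimate [U_p, U_q, U_r] with U_p fixed at zero."""
    def u_q_given(u_r):
        # Inner search: match the observed proportion for the second item.
        return _bisect(
            lambda u_q: purchase_probability(
                1, scale_values, dispersions, [0.0, u_q, u_r]),
            observed[1])

    def p_r_given(u_r):
        # Outer search target: proportion for the third item.
        return purchase_probability(
            2, scale_values, dispersions, [0.0, u_q_given(u_r), u_r])

    u_r = _bisect(p_r_given, observed[2])
    return [0.0, u_q_given(u_r), u_r]


# Hypothetical check: generate proportions from known utilities, then recover them.
scale_values = [0.5, 0.1, 0.3]
dispersions = [1.0, 1.1, 0.9]
true_utilities = [0.0, 0.4, -0.3]
observed = [purchase_probability(t, scale_values, dispersions, true_utilities)
            for t in range(3)]
estimated = solve_utilities(scale_values, dispersions, observed)
print([round(u, 3) for u in estimated])
```

Only two of the three proportions are used in the search, reflecting the remark above that the third equation is not independent.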

The Application 

For this study competing entrees on a 
luncheon menu serve as stimuli. A seven- 
category successive category rating scale was 
mailed to each of the 430 faculty members who 
were also active members of the faculty club at 
the University of Chicago. The addressee was
instructed to complete the form by placing a 
check mark to indicate the degree to which he 
liked or disliked each menu item. Included on 
the schedule were the names of the 15 entrees 
served at the club during a criterion period. A 
total of 297 completed forms were returned, 
comprising 69 per cent of those mailed. 

Five criterion days were selected. On no 
criterion day was there a shortage of a luncheon 
item at the club, and on each day more than 
100 members patronized the regular dining 
room facilities. The frequencies of purchase 
of the three competing luncheon entrees on 
each of the five days serve as criteria. 

From the preference ratings, approximate
least squares successive intervals estimates for
scale values and discriminal dispersions were
obtained by a graphical method (Jones &
Thurstone, 1955). Based upon these preference
parameters, and upon the assumption of
the normality of distributions of preference
along the underlying scale continuum, one may
utilize Equation [6] to predict the proportion
of consumers who would select each of three
competing consumer objects. The resulting
predicted proportions appear in Table 1,
Column A. A comparison of these predicted
proportions with actual observed proportions
of choice indicates that discrepancies are
considerable. The average error in predicting
proportions is .194.

The relatively poor fit of predicted to observed
proportions may be partially attributable
to the differing prices at which entrees
were sold. On each criterion day, one of the
three entrees was offered at $1.20, one at between
$.95 and $1.05, and the third at between
$.80 and $.90. For convenience each of the
three price levels is considered homogeneous,
best represented by the prices $1.20, $1.00,
and $.85.





Lyle V. Jones 


Table 1 


The Menu Items, Their Prices, and Observed and Predicted Proportions of Purchase for Each 


(N is the number of luncheon patrons, and serves as the base for the observed proportion) 


Entree

Roast round of beef
Smoked tongue
Creamed mushrooms on toast

Fried chicken leg with country gravy
Meat loaf with brown gravy
Welsh rarebit on toast

Roast leg of lamb
Smoked Thuringer sausage
French fried smelts with tartar sauce

Roast leg of lamb
Braised ox joints
Baked beans

Roast round of beef
Creamed chicken with hot biscuit
Apple fritters, bacon, and syrup


Mean d(pred-obs) 


Using the prediction of purchase model 
specified by Equation [10] above, iterative 
solutions for the utilities of the three price 
levels were obtained for each of five sets of 
three equations. Each set of three equations 
provides a unique estimate for $U'_{.85}$, $U'_{1.00}$, and
$U'_{1.20}$, on a scale with an arbitrary zero. The
obtained estimates appear in Table 2. It will 
be noted that the most divergent values are 


Fig. 1. Final estimates of utility of the three prices. (Horizontal axis: Price.)


Proportion
Observed
Choice


Predicted Proportions


.405
.319
.276

.215
.505
.280

.268
.342
.390

.441
.304
.255

.295
.439
.266


those for Criterion Day 3. The lowest cost 
entree on that day is french fried smelt. That 
the day was a Friday appears to have added a 
determinant of purchase which is not included 
in the model. 

It is also of interest to examine a plot of the 
mean utilities from Table 2 (Fig. 1) to deter- 
mine relative strength of negative utility for 
the three prices. The values of $U_{.85}$ and $U_{1.20}$
are consistently more negative than $U_{1.00}$.


Table 2 
Estimates of Utility Values 


($U'_{.85}$ is arbitrarily assigned zero utility)


Criterion 
Day U; 2 


me Wh = 


Mean 







While $1.20 is the least preferred price, $1.00 
is a price preferred to $.85. In other words, 
in this study, utility of price is not monotoni- 
cally related to price. 

Utilizing the mean values for the three 
utilities, final predictions are made, the results 
of which appear in Column B, Table 1. The 
improvement of fit is demonstrated by the 
relatively small average discrepancy of pre- 
dicted from observed proportions, .031, and 
lends credence to the model. 

The finding that faculty members, when 
lunching at the faculty club, prefer paying 
$1.00 to paying $.85 may come as a surprise. 
However, we might conjecture that the social 
psychology of publicly ordering lunch at a 
table with colleagues provides a disposition 
away from the cheapest meal. The present 
study, of course, provides no evidence as to the 
source of the finding. Nor may we legiti- 
mately generalize the findings to any other 
situations. Nevertheless, it might not be sur- 
prising to find such nonmonotonic relations 
between price and utility of price for numerous 
consumer commodities: for cosmetics, articles 
of clothing, household drug supplies—indeed, 
for any items where the consumer evaluation 
of quality is difficult or impossible to make 
independent of price. 

The method of measuring utility of money
illustrated in this study can be characterized
as involving two important restrictions: (a) the
concept measured is one of group utility rather
than individual utility; (b) the method involves
inferred rather than direct measurement, in
this case measurement inferred from a preference
model and from proportions of choice of
alternative items. With respect to the first
restriction, it should be recognized that similar
methods might be adapted to the study of
individual utility. A population could be defined
by the response repertory of a single
individual. A study of economic indifference
functions by Thurstone (1931) illustrates this
point. As for the second restriction, it emphasizes
the need for cross-validation, i.e., for
evaluation of obtained estimates via prediction
of results for experimentally independent data.


REFERENCES 


Adams, E., & Messick, S. An axiomatic formulation
and generalization of successive intervals scaling.
Psychometrika, 1958, 23, 355-368.

Bock, R. D. A generalization of the law of comparative
judgment applied to a problem in prediction of
choice. Amer. Psychologist, 1956, 11, 443. (Abstract)

Jones, L. V. Prediction of consumer purchase. Amer.
Psychologist, 1956, 11, 443. (Abstract)

Jones, L. V., & Thurstone, L. L. The psychophysics
of semantics: An experimental investigation. J.
appl. Psychol., 1955, 39, 31-36.

Pearson, K. Tables for statisticians and biometricians.
Part II. London: Biometric Lab., University
College, 1931.

Plackett, R. L. A reduction formula for normal
multivariate integrals. Biometrika, 1954, 41, 351-360.

Thurstone, L. L. The indifference function. J. soc.
Psychol., 1931, 2, 139-167.

Thurstone, L. L. The prediction of choice. Psychometrika,
1945, 10, 237-253.

Thurstone, L. L. An experiment in the prediction of
choice. Univer. of Chicago, Psychometric Lab. Res.
Rep., 1951, No. 68.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


CATEGORIES OF THOUGHT OF MANAGERS, CLERKS, 
AND WORKERS ABOUT JOBS AND PEOPLE 
IN AN INDUSTRY ¹


HARRY C. TRIANDIS ²


Cornell University 


Recent work in perception (Hayek, 1952) 
and thinking (Bruner, Goodnow, & Austin, 
1956) has used categorization as one of the 
main units of analysis. Kelly (1955) has 
stressed the usefulness of the knowledge of 
the patient’s categories of thought, or personal 
constructs, in clinical therapy. Triandis 
(1958) has shown that the more similar the 
categories of thought employed by two people, 
the more likely it is that they will communi- 
cate and the greater the likelihood that they 
will like each other. 

The present report describes a method for 
obtaining the categories of thought of Ss, pre- 
sents lists of these categories obtained from 
various groups in industry, and attempts to 
assess the significance of these differences. It 
concentrates on only two cognitive domains: 
jobs and people. 


Method 
Procedure 


Triads of jobs (or people) were presented to the 
Ss, who were asked “Which one of these three jobs 
(or people) is more different from the other two?” 
and “Why?” The adjectives or characteristics of 
the jobs (or people) which were obtained in this 
way, together with their opposites, also supplied by 
the Ss, are the categories of thought of the Ss. The 
categories were obtained in group sessions with about 
5 to 15 Ss, of the same status, attending each ses- 
sion. All Ss were given 12 triads of jobs and 12 
triads of people. The triads were formed with the 
following jobs: the present job, a previous job, the 
job that S would be likely to be doing if he did 
not have his present job, a job S would like to have, 
a job S hopes to have some day, a job that S considers
very useful, a job that would make S very
happy, a job S considers very interesting, and the 
best paying job that S thinks he will have some day. 


1 This paper is based on portions of the writer's
doctoral dissertation. The author gratefully ac- 
knowledges the guidance and help of W. W. Lam- 
bert, T. A. Ryan, and W. F. Whyte. The larger 
study, of which this is a part, was supported by a 
grant from the Foundation for Research on Human 
Behavior. 

2 Now at the University of Illinois. 


The person triads were formed with the following 
people: the self, the S’s supervisor, a person liked 
by S, a hardworking fellow worker, a skillful fellow 
worker, a fellow worker devoted to his job, a fel- 
low worker for whom S feels sorry, a person who is 
well known throughout the company, and a successful
person who is personally known by S. No repetitions
were allowed in the choice of jobs and people
to fit the above designations, or in the categories
that were elicited from each triad. The Kelly (1955)
group presentation format of the Repertory Test
was used. The test sessions were of a 2-hr. duration.

Two sophomores, acting as clerical assistants, 
classified the categories obtained from each group 
of Ss as follows: A protocol was picked at random, 
and all the categories were recorded. Each of the 
categories of the second and all subsequent protocols 
were judged either similar or different from the 
categories already recorded. If they were judged to 
be similar (say, good-bad and moral-immoral are 
similar), a checkmark was entered next to the al- 
ready existing category. If they were judged to be 
different, they were recorded so as to constitute a 
new class of categories. Categories with more than 
one entry were called category-classes. 
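
The tallying step just described can be sketched in code. In the study the similarity judgments were made by human assistants; here a caller-supplied function stands in for them, and the small synonym table is purely hypothetical (it borrows the good-bad and moral-immoral example from the text). The function names and the sample protocols are likewise invented for illustration.

```python
import random


def tally_category_classes(protocols, is_similar, rng=None):
    """Tally categories across protocols.

    protocols: list of lists of category labels, one list per S.
    is_similar: callable(label_a, label_b) -> bool, standing in for
        the judges' similar/different decision.
    Returns a list of [label, number_of_entries] pairs.
    """
    if not protocols:
        return []
    rng = rng or random.Random()
    order = list(range(len(protocols)))
    first = rng.choice(order)        # a protocol is picked at random
    order.remove(first)
    order.insert(0, first)
    recorded = []
    for position, index in enumerate(order):
        for category in protocols[index]:
            if position == 0:
                # All categories of the first protocol are recorded as they stand.
                recorded.append([category, 1])
                continue
            for entry in recorded:
                if is_similar(category, entry[0]):
                    entry[1] += 1    # a checkmark next to the existing category
                    break
            else:
                recorded.append([category, 1])   # judged different: new class
    return recorded


def category_classes(recorded):
    """Categories with more than one entry are the category-classes."""
    return [(label, count) for label, count in recorded if count > 1]


# Hypothetical stand-in for the judges: labels mapped to a common meaning.
SYNONYMS = {"moral-immoral": "good-bad"}


def same_meaning(a, b):
    return SYNONYMS.get(a, a) == SYNONYMS.get(b, b)


protocols = [
    ["good-bad", "slow-fast"],
    ["moral-immoral", "tall-short"],
    ["good-bad"],
]
recorded = tally_category_classes(protocols, same_meaning, random.Random(0))
classes = category_classes(recorded)
print(classes)
```

The label kept for each class is simply whichever wording was recorded first, so it depends on which protocol happens to be picked at the start; the counts do not.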

The category-classes so constituted were then rearranged
in “logical fashion” and formed category-
groups; for instance, the category-classes clean-dirty, 
good-bad, beautiful-ugly, etc. formed the “evalua- 
tive” category-group. 


Location 


The study was conducted in a rural New York 
State community of about 2000 people, which we 
shall call Treeville. The company which was studied 
employs about 300 people. It manufactures auto- 
motive equipment requiring skillful engineering and 
production and markets it nationally. The headquarters
are in Treeville.


Subjects 


Usable categories of thought were obtained from 
17 department heads and vice presidents (upper 
management—[UM]), 20 section heads and foremen 
(other management—[M]), 27 female clerks (FCL), 
20 male clerks (MCL) and 21 workers (W). The 
background of the lower management, the female 
clerks, and the workers is predominately rural. 
Many workers were unsuccessful farmers before tak- 
ing a job with the company. Only the highly skilled 
had been industrial workers all their lives. On the 
other hand, the background of the male clerks and 







the upper managers is predominately urban. The 
average age of the workers is much higher than that 
of the other groups (32% were older than 51 years; 
only 13% of the top managers were older than 50). 
The average years of residence in the vicinity of 
Treeville was much higher for the workers (20 
years) than for any of the other groups. 


Results 


The complete lists of the categories that 
were obtained are available elsewhere (Tri- 
andis, 1958, pp. 124-134). Separate lists are 
available for the upper managers, the lower 
and middle managers, the female clerks, the 
male clerks, and the workers. Most of these 
lists contain more than 200 categories, ar- 
ranged in more than 100 category-classes. In 
order to reduce the number of category- 
classes in each list, the number of categories 
in each category-class was considered. Since 
the number of Ss, the number of categories 
obtained, and the number of category-classes 
vary from group to group, it was considered 
desirable to use a cutting point that will con- 
sider all three variables. Such a criterion was 
obtained by means of an adaptation of the
chi-square test.³ The criterion is equivalent
from group to group because it takes into 
consideration the three variables mentioned 
above. 

Tables 1 and 2 present the most frequently 
used category-classes for people and jobs. 

Table 3 presents the category groups with 
the highest frequency of entry. 


3 If N Ss gave m categories of thought and these
categories were classified in n category-classes (of
course, $n \le m$), then if there is an equal chance that
a given category of thought will be classified in any
of the n category-classes, there will be m/n categories
of thought in each category-class. Furthermore,
with n category-classes we can do n chi-square
tests, by comparing the frequencies in each category-class
with the frequencies in all the other category-classes
combined. The level of significance of our
chi-square tests should be .05/n (this is a conservative
way of taking care of the multiple tests). Thus,
if V is the value of the chi square that is significant
at that level, we will have

$$\frac{(F - m/n)^2}{m/n} = V$$

where F is the observed frequency in a given category-class.
If we solve for F we can state that any
category-classes that have frequencies that exceed F
are to be retained in our summary lists. This procedure
retains only the most frequent category-classes,
considers the number of Ss, the number of
categories, and the number of classes, and establishes
a criterion that is equivalent from group to group.
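
The cutting point in this footnote can be worked out numerically. The sketch below solves the equation for F, taking the upper root, F = m/n + sqrt(V · m/n). It assumes that each of the n tests has one degree of freedom (one class against all others combined), which the footnote does not state; under that assumption the critical value V is the square of a two-sided normal deviate. The function name and the round figures of 200 categories in 100 classes are illustrative only.

```python
import math
from statistics import NormalDist


def cutoff_frequency(m_categories, n_classes, alpha=0.05):
    """Return (expected frequency, critical chi square V, cutoff F).

    Assumption: each test has 1 degree of freedom, so V = z**2 where z
    is the two-sided normal deviate at significance level alpha / n.
    """
    expected = m_categories / n_classes          # m/n under equal chances
    per_test_alpha = alpha / n_classes           # conservative .05/n level
    z = NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)
    v = z * z
    cutoff = expected + math.sqrt(v * expected)  # upper root of (F - m/n)^2 / (m/n) = V
    return expected, v, cutoff


# Illustrative round numbers: 200 categories sorted into 100 category-classes.
expected, v, cutoff = cutoff_frequency(200, 100)
print(round(expected, 2), round(v, 2), round(cutoff, 2))
```

Category-classes whose observed frequency exceeds the returned cutoff would be the ones retained in the summary lists.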


Discussion 
The Broad Categories (Category-Groups) 


Table 3 suggests that when members of 
lower management perceive people, in the in- 
dustrial situation, they tend to use power 
categories—is this fellow a supervisor? Does 
he have authority? Is he an executive? 
These are the important dimensions for this 
group. Our interpretation is that the transi- 
tiveness of the lower management’s position 
in the hierarchy of the organization makes 
power an important broad category, whereas 
for those who are at the bottom, and for those 
who “have arrived,” this category is not as 
important. 

Evaluation and, to a lesser extent, activity 
seem to be fundamental categories of perceiv- 
ing people, since they are used by all the 
groups significantly frequently. In other 
words, when the Ss perceive another person, 
they tend to evaluate him—how good, fair, 
capable, cooperative, nice person, well liked, 
sociable, etc. is he? They are also concerned 
with his rate of activity; slow-fast, active- 
passive. Within the broad category of evalua- 
tion, however, there are differences between 
the groups. The upper managers emphasize 
background—college trained, educated, pro- 
fessional, good background—while the work- 
ers emphasize dependability—careful, safe to 
work with, thorough. The latter finding is to 
be expected in a shift shop much more than 
in other work situations; sloppy work of one 
shift may create problems in the next shift. 
Since the company’s factory operates on a 
two-shift basis, we would expect to get this 
dimension stressed in this factory, while in 
other factories some other aspect of evalua- 
tion may be more important. 

When perceiving jobs, job characteristics 
and job requirements seem to determine the 
field. Whether the job is buying or selling, 
production or planning, rare or common, in- 
volves personal contacts or no personal con- 
tacts, is seasonal or steady, etc. seems impor- 
tant. Among the requirements, intelligence, 
creativeness, ability to deal with details, education,
etc. are important. All groups, except
the workers, are concerned with the “power” 
aspects of the job—how much authority. The 
workers are concerned with whether the job 







Table 1 
Category-Classes with the Highest Frequencies of Entries for All Groups of Subjects. Domain: People 


Upper Management Workers 


Introvert-extrovert* Slow-fast* 

Gracious-crude* Gets high pay-gets low pay* 

Experienced-inexperienced* Stable-unstable* 

Educated-uneducated* Quiet-loud 

Tall-short* Intelligent-unintelligent* 

Old-young* Skilled-unskilled* 
Worker-manager* 


Other Management 
Supervisor-nonsupervisor* Worker-manager* Office-factory* 
Intelligent-unintelligent* Experienced-inexperienced* 
Introvert-extrovert* 


Female Clerks 
Intelligent-unintelligent*** Ambitious-unambitious* Friendly-unfriendly* 
Nice-poor personality* Introvert-extrovert* Tall-short* 
Dependable-undependable* Young-old* Quiet-talkative* 
Quiet-loud* Executive-nonexecutive* 

Male Clerks 
Introvert-extrovert*** Old-young* 
Worker-manager* Office-factory* 
Experienced-inexperienced* Quiet-talkative* 


Friendly-unfriendly* 
Intelligent-unintelligent* 
Executive-nonexecutive* 


Note.—The number of asterisks indicate the importance of the category for the group. The categories are listed in order 
of importance. 


Table 2 

Category-Classes with the Highest Frequencies of Entries for All Groups of Subjects. Domain: Jobs 
Upper Management Workers 

High-low pay** High-low pay*** 
Much-little responsibility** Dirty-clean** 
Broad-specific duties** Skilled-unskilled* 
Executive-nonexecutive* Difficult-easy* 
Creative-uncreative* Requires much-little intelligence* 
Many-few personal contacts* 
Skilled-unskilled* 
Worker-manager* 


Female Clerks
Interesting-uninteresting*** 
Manual-mental work* Skilled-unskilled*** 
Office-factory** 

Many-few contacts with people** 
Employer-employee* 


Other Management 
Deals with details-generalities** Much-little responsibility** 
Skilled-unskilled* High-low pay* Planning-production* 
Executive-nonexecutive* Supervisory-nonsupervisory* 
Male Clerks 
Requires much-little responsibility*** High-low pay** 


Dirty-clean* 
Mental-manual work* Employer-employee**


Executive-nonexecutive* 


Note.—See footnote to Table 1. 







Table 3

Category-Groups with the Highest Frequency of Entry

People
Upper Management: Evaluation,*** Activity,* Background*
Lower Management: Evaluation,*** Power**
Workers: Evaluation,*** Activity,** Dependability*
Clerks: Evaluation,*** Activity**

Jobs
Upper Management: Job characteristics,*** Job requirements,* Power*
Lower Management: Job characteristics,*** Power,* Job requirements*
Workers: Job characteristics,** and Requirements,** and Evaluation**
Female Clerks: Job requirements,** and Characteristics,* and Power*
Male Clerks: Job characteristics,** and Requirements,* and Power*

Note. See footnote to Table 1.


is clean or dirty much more than any of the 
other groups. 


The Categories of Various Groups: For People 


Turning now to more specific aspects of the 
problem, we observe that the female clerks, 
when they perceive people in industry, tend 
to stress the dimensions which one might ex- 
pect are relevant to “eligible bachelors”: The 
most frequently used categories are intelli- 
gent, ambitious, nice personality, friendly, 
tall, dependable, young, etc. 

The reader is free to give his own interpre- 
tations to the lists presented in Tables 1 and 
2. The present writer’s interpretation of 
these lists is as naive as possible. He sees 
these lists as expressions of the concern of 
people about status. Status, however, is ex- 
pressed in different terms by each group. The 
upper managers use upper class criteria, such 
as graciousness, education, and polish. The 
lower managers make power distinctions 
(worker-manager, office-factory). The work- 
ers take the more ordinary approach to status, 
namely, money. This suggests, if wild specu- 
lation is permissible, that rewards such as 
membership to the country club would be 




most effective with upper management, while 
lower management may simply settle for a 
nice sounding title and the feeling that they 
have power over others. Finally, the workers 
would probably settle for just money. Of 
course, if they did get money they would want 
power, too, and if they did get power, they 
would want membership in the country club. 
This suggests a sort of hierarchy of rewards, 
parallel to Maslow’s (1954) hierarchy of 
motives (survival, safety, belongingness, es- 
teem, and self actualization). It is the writer’s 
impression that top management is very well 
paid in relation to the income of the average 
resident of Treeville. An income of $5000 
per year is considered “making good money” 
in town. The workers earn much less than
that. On the other hand, the top managers 
earn at least twice and often three or four 
times this sum. Furthermore, their income 
exceeds that of even the wealthy residents of 
Treeville. In view of this difference, it seems 
reasonable that workers would consider 
“money he earns” as one of the dimensions 
of interpersonal perception in the factory, 
since this is a dimension of “relative depriva- 
tion.” The idea that “relative deprivation” 
may be a factor determining whether a dimen- 
sion will be used more often than other di- 
mensions by a particular group is strength- 
ened when we consider the lower management. 
This writer, in the course of interviewing the 
lower management, was left with the definite 
impression that this group was very much 
concerned with problems of authority. It 
seems that most decisions are still made at 
the top and the lower managers simply carry 
them out, often blindly. Whether or not a 
person has the authority to make decisions 
is crucial for lower management; it is a di- 
mension along which they are “relatively de- 
prived.” Our data show that authority, or 
power, is an important category of that group. 
On the other hand, their pay is around 
$5000, which is “good enough,” and so they 
do not use the dimension of pay in their 
interpersonal perception nearly as much as 
the workers. 

Another theme that runs through these 
categories is what Parsons and his associates 
(1953, 1955) call “instrumental versus ex- 
pressive.” Upper management and the clerks 







are concerned more with expressive categories 
(gracious, friendly, extroverted, talkative, 
many-few personal contacts), while lower 
management and the workers are more concerned
with instrumental categories (intelligent,
skilled, dependable, stable).

Certain categories are used by all groups; 
for instance, friendly-unfriendly, intelligent- 
unintelligent, skilled-unskilled, experienced- 
inexperienced, introvert-extrovert, old-young, 
cripple-fit, quiet-talkative, excitable-calm, 
slow-fast, worker-manager, office-factory. Some 
dimensions seem to be stressed almost exclu- 
sively by some groups; e.g., upper manage- 
ment stresses well-poorly groomed, educated, 
has university degree, gracious-crude, fast- 
slow thinker, staff-line, etc., lower manage- 
ment stresses supervisor-not supervisor, the 
workers stress quality worker, serious worker, 
good-bad, careful-careless, stable-unstable, 
alert-sleepy, low-high pay. 

Certain categories are used by women more 
often than they are used by men. Young,⁴
working with triads of community organiza- 
tions, found such differences too. He found, 
for instance, that women stressed the religious 
character of the organizations more often than 
the men; men stressed more often whether 
the organization is all male or mixed. Simi- 
larly, in our lists we found many strictly male 
dimensions, such as knows-does not know his 
job, slow-fast, dependent-independent, experienced-inexperienced,
has much drive-is lazy,
and some female dimensions, such as honest- 
dishonest, polite-impolite, dependable-unde- 
pendable, married-single, etc. It seems that 
the male dimensions are somewhat more in- 
strumental and the female dimensions some- 
what more expressive. This is consistent with 
Parsons’ (1955) theorizing.

When we undertook this research we did 
not expect to obtain any differences along the 
instrumental-expressive dimension. Since we 
did find some such differences, however, it 
may be worth exploring this aspect of our 
findings more than we have so far. One ques- 
tion of considerable interest is whether good 
leaders use both instrumental and expressive 
categories, as is consistent with the “great 
man” theory of leadership. Borgatta, Couch, 
and Bales (1954) seem to think so, and have 


4 F. Young. Personal communication, 1957.




experimental evidence to support this view. 
Turning to our data we find that the top 
managers (who are presumably better leaders 
than their subordinates) used a much more 
balanced set of categories than any of their 
subordinates. This supports the “great man” 
theory of leadership. It also suggests that 
leadership training is, perhaps, a process 
whereby a person comes to see other people 
in terms of both instrumental and expressive 
categories. 


The Categories for Jobs 


All groups, except the women, consider pay 
an important characteristic of the job. Skill, 
undoubtedly interpreted variously by each 
group (administrative, manual, typing, etc.), 
is on all the lists except that of the male 
clerks. The nature of the work (creative, in- 
teresting) and certain expressive characteris- 
tics seem to concern the upper management 
and the female clerks. Status symbols, such 
as manual-mental, office-factory, dirty-clean, 
appear in some lists. The workers seem much 
concerned with job requirements, particularly 
intelligence, and seem to have a sort of “in- 
feriority complex” about intelligence. This 
came out also in interviews where the workers 
often told me: “I don’t think I have the edu- 
cation to understand this,” “I wish I were 
more intelligent,” “management knows about 
this, I don’t,” etc. The final observation is 
that the two management groups use cate- 
gories that are more similar to each other 
than to any of the other groups. 

In the case of jobs, just as in the case of 
people, certain categories were used by all the 
groups. These include indoors-outdoors, difficult-easy,
skilled-unskilled, gets high-low pay,
requires much-little education, involves much- 
no responsibility, desirable-undesirable, has- 
has no authority, employer-employee, travels 
a lot-stays put, routine-variable, interesting- 
uninteresting, manual-mental. 

Some of these categories are stressed more 
by one group, however, than they are stressed 
by another. Some categories are predominantly
managerial (planning-production, deals
with theory-practice, requires good writing, 
policy making, has many-few contacts with 
people, creative, deals with generalities-details, 
sales-production), others are predominantly 







worker categories (careful-sloppy, requires ex- 
tensive training, dirty-clean, honest-dishonest, 
low-high type work, dressed up-in work 
clothes). 


General Discussion 


The reader must keep in mind the fact that 
if a certain characteristic is approximately 
equally distributed among the various ele- 
ments (people or jobs) involved in the triadic 
comparisons, this characteristic will not ap- 
pear in our list of Tables 1 and 2. This is 
simply because of the nature of the triadic 
procedure. Equal distribution of the charac- 
teristic implies that it can be completely ab- 
sent, it can be present, or it can be present 
in large amounts; as long as it is equally 
present, it will not appear. Thus, if you 
compare, in the triadic procedure, three millionaires,
or three hobos, money will not be
given as the characteristic along which these 
people are judged. When we find, then, that 
the managers do not consider “how much the 
man makes” as one of the important attri- 
butes of the man in industry, this does not 
imply that they do not consider this characteristic
when they meet a millionaire.

What, if anything, is to be gained by analyzing
the categories of various groups? As
has already been suggested, it may be shown
that certain individuals, e.g., top management
(which implies good leaders), use more of
one type of category than another. But more
important, perhaps, is the question of “what
categories should a group use when trying to
communicate with another?” Suppose, for
example, that management is preparing a
merit rating scale for use with workers. What
categories should it use?

Our analysis of one merit rating scale shows
very little overlap of the categories used by
workers in our study and the categories used
on that merit rating scale. It is an empirical
question whether workers judged on such a
merit scale would find it just and fair. It is
possible that a merit rating scale using some
of the categories that we obtained from our
triadic procedure would be more effective.
Merit rating has, among other functions, the
most important function of providing material
for discussions on how the employee may improve.
One suspects that trying to convince
a worker to improve his industriousness will
be less effective than trying to convince him
to improve his skill, or his chances of advancement.

Examination of the job evaluation plan of 
the National Electrical Manufacturers Asso- 
ciation (Tiffin, 1952) shows that the cate- 
gories they have used overlap very much with 
the categories found in our research project. 
They used skill, effort, job conditions, etc. 
This plan, then, is probably more effective, 
from the point of view of communication, 
than the previously mentioned merit rating 
plan. We advance the hypothesis then that 
if “management uses the worker’s dimensions 
in its communication with the workers it will 
be more successful in its communication.” 
The evidence presented by Triandis (1958) 
supports this hypothesis. 

It must be kept in mind, however, that the 
categories presented in Tables 1 and 2 apply 
to a particular plant, in a particular town. 
Only more research will answer such questions 
as whether or not the patterns that were ob- 
tained are stable, what is the influence of 
situational variables on the patterns, etc. 


Summary 


Triads of jobs and people were presented 
to 105 Ss. The Ss were managers, clerks, 
and workers in a small New York State indus- 
trial concern. The Ss were asked “Which one 
of these three jobs (people) is more different 
from the other two?” and “Why?” The char- 
acteristics that differentiated one member of 
the triad from the other members were listed. 
Certain differences in the lists obtained from 
the various groups were observed. An at- 
tempt was made to assess the significance of 
these differences for intergroup communica- 
tions in industry. 


Received December 15, 1958. 


References 


Borgatta, E. F., Couch, A. S., & Bales, R. F. Some
findings relevant to the great man theory of leader-
ship. Amer. sociol. Rev., 1954, 19, 755-759.

Bruner, J. S., Goodnow, J. J., & Austin, G. A. A
study of thinking. New York: Wiley, 1956.

Hayek, F. A. The sensory order. An inquiry into
the foundations of theoretical psychology. Chi-
cago: Univer. Chicago Press, 1952.

Kelly, G. A. The psychology of personal constructs.
New York: Norton, 1955.

Maslow, A. H. Motivation and personality. New
York: Harper, 1954.

Osgood, C. E., & Suci, G. J. Factor analysis of
meaning. J. exp. Psychol., 1955, 50, 325-338.

Parsons, T., & Bales, R. F. Family socialization and
interaction process. Glencoe, Ill.: Free Press, 1955.

Parsons, T., Bales, R. F., & Shils, E. A. Working
papers in the theory of action. Glencoe, Ill.: Free
Press, 1953.

Tiffin, J. Industrial psychology. New York: Pren-
tice Hall, 1952.

Triandis, H. C. Some cognitive factors affecting
communication. Unpublished doctoral dissertation,
Cornell Univer., 1958.





Journal of Applied Psychology 
Vol. 43, No. 5, 1959 


RELATIONSHIPS BETWEEN A TOP-MIDDLE MAN- 
AGEMENT SELF-DESCRIPTION SCALE AND 
BEHAVIOR IN A GROUP SITUATION 1


LYMAN W. PORTER AND ROGER A. KAUFMAN 2


University of California, Berkeley 


In a recent study by Porter and Ghiselli 
(1957) the self-perceptions of top and mid- 
dle management personnel were compared. 
Using a self-description inventory developed 
by Ghiselli (1954), it was found that 21 of 
64 pairs of forced-choice adjectives signifi- 
cantly differentiated between the two groups. 
An analysis of the differentiating items indi- 
cated that top management personnel, when 
contrasted with middle management person- 
nel, viewed themselves as being more self-re- 
liant, self-confident, enterprising, and bolder. 
The middle management individuals, on the 
other hand, described themselves as more 
careful, thoughtful, deliberate, and controlled. 

Ghiselli and Lodahl used the 21 differenti- 
ating items in the Porter and Ghiselli study 
to form a scale which they termed a Deci- 
sion-Making Approach (DMA) scale (1958a, 
1958b). Although the scale is composed of 
the items that differentiated top from middle 
management personnel, it was given the name 
Decision-Making Approach because many of 
the items seemed to describe how individuals 
in these groups might approach the decision- 
making process if their behavior fitted their 
self-descriptions. Ghiselli and Lodahl (1958a, 
p. 63) point out, however, that “this scale 
probably measures more than just the ap-
proach to decision making.”

The DMA scale was first used by Ghiselli 
and Lodahl in an investigation of group effec- 
tiveness in a complex task requiring coopera- 
tion among group members to perform the 
task. The Ss were male college students who 
worked in groups of two, three, and four to 
operate a small model railroad by means of 
interrelated control switches. In order for a 
group to perform effectively, the members 


1 The authors wish to thank Karl Hakmiller and 
Robert Messman for their aid in the conduct of this 
study. 

2 Now at Pilotless Aircraft Division, Boeing Air-
plane Company.




had to organize themselves so that the cor- 
rect switches were operated at the proper 
time; the task thus required coordination of 
the efforts of the several individuals. Ghiselli 
and Lodahl related the pattern of scores on 
the DMA scale achieved by the individuals in 
a group to the effectiveness of the group on 
the task. They found that the mean DMA
score of a group was not related significantly 
to effectiveness, whereas the positive skewness 
of DMA scores in a group (i.e., “the differ- 
ence between the highest and next highest 
scorer, and this difference minus the range of 
the lower scorers”) correlated .82 with group 
effectiveness. It appears that the best per- 
forming groups were those in which one per- 
son scored high on the DMA (i.e., was to- 
ward the top management end of the scale) 
relative to the other members of the group, 
and where the lower scores of these other 
members formed a relatively small range. 
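
One reading of the quoted skewness measure can be sketched in Python. The function name `skew_index`, the example scores, and the choice to report both quoted quantities (the top gap, and that gap minus the range of the lower scorers) are assumptions made for illustration; the exact computation used by Ghiselli and Lodahl is not given here.

```python
def skew_index(dma_scores):
    """Return (gap, gap_minus_lower_range) for one group's DMA scores.

    gap: highest score minus next-highest score.
    gap_minus_lower_range: gap minus the range of the scores below the highest.
    Both names are hypothetical labels for the two quantities in the quotation.
    """
    if len(dma_scores) < 2:
        raise ValueError("need at least two group members")
    ordered = sorted(dma_scores, reverse=True)
    gap = ordered[0] - ordered[1]
    lower = ordered[1:]                    # everyone except the top scorer
    lower_range = max(lower) - min(lower)  # zero in a two-person group
    return gap, gap - lower_range


# Hypothetical scores: one high scorer, two closely grouped lower scorers.
print(skew_index([15, 8, 7]))   # (7, 6)
# Hypothetical scores: no member stands out from the rest.
print(skew_index([10, 9, 4]))   # (1, -4)
```

Under this reading, a group with a single clear high scorer and tightly clustered lower scorers receives the larger index, which matches the pattern described for the best performing groups.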

In a later study of industrial work groups, 
Ghiselli and Lodahl again found the pattern 
of DMA scores within a group to be impor-
tant, this time in relation to the merit ratings
of the foremen who directed the groups. It 
was found that there was a negative correla- 
tion (—.57) between a positively skewed pat- 
tern of scores of workers in a group and the 
merit rating given the foreman of that group. 
This negative correlation was interpreted as 
indicating that “when a foreman happens to 
be assigned to a work group which has con- 
siderable capacity for self management, higher 
management will regard him less well than if 
he happens to be assigned to a work group 
with little capacity for self management” 
(Ghiselli & Lodahl, 1958b, p. 185).

The clear implication of both of the Ghiselli 
and Lodahl studies is that individuals’ DMA 
scores are indicative of certain types of be- 
havior in the work situation which affect the 
performance of a group. However, neither
study presented evidence as to whether an in-
dividual actually behaves in the task situation
as he describes himself on the DMA scale. 
The present study is aimed at this type of 
investigation. It is hypothesized that indi- 
viduals scoring high on the DMA scale will 
behave more like top management personnel 
describe themselves than will people scoring 
low. Specifically, in a situation requiring 
several individuals to interact with each other 
in order to perform a task, those high on the 
DMA should make a higher ratio of goal- 
setting, role-assignment, and task-performance 
suggestion comments to comments of an 
evaluative or cautionary nature, than should 
those scoring low on the DMA. It is fur- 
ther hypothesized that these differential be- 
havior patterns will be perceived by the group 
members themselves, and that therefore their 
ratings of their own group members on ques- 
tions dealing with these patterns of behavior 
will be in accordance with the members’ DMA 
scores. 


Method 
Subjects 


Sixty male undergraduate students, composing 20 
groups of three persons each, served as the Ss in this
experiment. 


Procedure and tasks 


At the beginning of an experimental period, each 
of the three Ss in a group filled out Ghiselli’s Self 
Description Inventory (SDI). After completing this 
inventory, the Ss were given instructions which in- 
formed the group of the following: (a) The group 
task was to construct a structure out of 8 × 11-in.
fiberboard cards, with the goal of making the struc- 
ture as high as possible. (b) There were to be six 
two-min. building periods, with each followed by a 
one-min. discussion interval in which no construc- 
tion could take place; construction was to start anew 
at the beginning of each two-min. period. (c) Talk- 
ing was permitted at any time. (d) Certain restric- 
tions regarding the use of cards in construction were 
imposed. (e) The group’s score for each building 
period would be the number of stories left standing 
at the end of the period, minus three points for each 
“wreck” (collapse of structure) that occurred during 
the period; the total group score would be the total 
number of net points for the six construction periods. 
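
The scoring rule in (e) amounts to simple arithmetic, sketched below in Python. The function names and the six-period record are hypothetical; only the three-point penalty per wreck and the summation over periods come from the instructions described above.

```python
WRECK_PENALTY = 3  # points deducted per collapse, as stated in the instructions


def period_score(stories_standing, wrecks):
    """Net points for one building period."""
    return stories_standing - WRECK_PENALTY * wrecks


def total_group_score(periods):
    """periods: iterable of (stories_standing, wrecks), one pair per period."""
    return sum(period_score(stories, wrecks) for stories, wrecks in periods)


# Hypothetical record for one group's six building periods.
session = [(4, 0), (5, 1), (0, 2), (6, 0), (7, 1), (8, 0)]
print(total_group_score(session))  # 18
```

Note that a period can yield a negative net score, as in the third hypothetical period, where two wrecks leave nothing standing.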

During the six two-min. building periods and 
the intervening one-min. discussion periods, one E 
counted and classified the various comments made 
by Ss to each other. (These data will be referred 
to as the Verbal Interaction data.) At the same 
time, another E observed the physical behavior of
the Ss in the construction of the structures and
ranked them after each building period on two dif- 
ferent types of physical behavior (discussed below 
under Physical Behavior). At the conclusion of the 
final building period the Ss filled out a questionnaire 
on which they were asked to rank the group mem- 
bers (including themselves) in response to each of 
nine questions regarding their performance in the 
task situation. (These data will be referred to as 
the Peer Ranking data.) 


Behavior measures 


The patterns of behavior that were predicted to 
accompany high DMA scores will be termed top 
management behavior, and those that were predicted 
to accompany low DMA scores, middle management 
behavior. Although such contrasting patterns of be- 
havior may not be unique to top and middle man- 
agement categories of individuals, the above terms 
will be used because of the nature of the origin of 
the DMA scale. 

Verbal interaction. As previously noted, one of 
the Es counted and classified the various comments 
made by each S in each group’s experimental session. 
Each comment or conversational unit was classified 
into one of the following three categories that had 
been set up in advance: (a) top management—these 
conversational units included such types of com- 
ments as goal setting ideas, directions, organization 
ideas, assignment of roles, and major suggestions for 
changes in procedure or attack on the building prob-
lem; (b) middle management—these conversational
units were remarks where the person evaluated ideas 
of others, asked for clarification of others’ ideas, or 
elaborated on others’ ideas; and (c) other—this cate- 
gory included all remarks which could not meaning- 
fully be classified into either the top or middle cate- 
gory. 

A Verbal Interaction score was obtained for each 
individual by computing the ratio of the number of 
his top comments to the number of his middle com- 
ments. This ratio of top/middle was used in all re- 
sults involving Verbal Interaction data, and is the 
primary behavioral data obtained in the study.
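
The top/middle ratio can be illustrated with a short Python sketch. The label strings, the function name, and the handling of an S who made no middle comments (returning `None`) are assumptions; the text does not say how such a case was treated.

```python
from collections import Counter


def verbal_interaction_score(categories):
    """Ratio of top to middle comments for one S.

    categories: sequence of 'top', 'middle', or 'other' labels,
    one per conversational unit (hypothetical encoding).
    """
    counts = Counter(categories)
    if counts["middle"] == 0:
        return None  # assumption: ratio treated as undefined
    return counts["top"] / counts["middle"]


# Hypothetical classified comments for one S.
comments = ["top", "middle", "top", "other", "top", "middle"]
print(verbal_interaction_score(comments))  # 1.5
```

Comments classified as "other" do not enter the ratio, so the score reflects only the balance between the two management categories.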

Physical behavior. While one E was recording the 
verbal interaction of the Ss, another E was observ- 
ing the physical behavior of the Ss as they attempted 
to build their card structures. Due to the relative
lack of discrete units of physical behavior in a task 
such as this, E ranked each of the three Ss at the 
end of each two-min. building session on how much 
of each of two types of behavior the S had exhibited 
during the period. The two types of physical be- 
havior that were rated corresponded as nearly as 
possible to the top and middle types of verbal be- 
havior. These two types of physical behavior were: 
(a) top—actively building the structure by taking 
the initiative in adding new parts to it; and (b) 
middle—assisting the construction by holding cards 
in place while others placed new ones on top, and 
checking parts of the structure to make sure it was 
stable, solid, etc. 







A Physical Behavior score was obtained for each 
individual by weighting each ranking (three points 
for a rank of one, two points for a rank of two, and 
one point for a rank of three) for each two-min. 
session, totaling the ranks for the six sessions, and 
then obtaining a ratio of top ranks to middle ranks 
for the individual. 
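
A minimal Python sketch of this weighting follows. The rank-to-point mapping is taken from the description above; the function name and the example ranks are hypothetical.

```python
RANK_WEIGHTS = {1: 3, 2: 2, 3: 1}  # points per rank, as described in the text


def physical_behavior_score(top_ranks, middle_ranks):
    """Ratio of weighted top ranks to weighted middle ranks for one S.

    top_ranks, middle_ranks: the rank (1, 2, or 3) the S received in each
    building session on the top and middle types of physical behavior.
    """
    if not top_ranks or not middle_ranks:
        raise ValueError("need at least one ranked session for each behavior")
    top_total = sum(RANK_WEIGHTS[r] for r in top_ranks)
    middle_total = sum(RANK_WEIGHTS[r] for r in middle_ranks)
    return top_total / middle_total


# Hypothetical ranks over six sessions for one S.
top_ranks = [1, 1, 2, 1, 1, 2]      # weighted total: 3+3+2+3+3+2 = 16
middle_ranks = [3, 2, 3, 3, 2, 3]   # weighted total: 1+2+1+1+2+1 = 8
print(physical_behavior_score(top_ranks, middle_ranks))  # 2.0
```

Because every rank carries at least one point, the denominator cannot be zero once at least one session has been ranked.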


Peer ranking measures 


The questionnaire administered to each S at the 
conclusion of the six building periods contained nine 
questions concerning the past task performance or 
future expected performance of the group’s mem- 
bers. On each question the S was asked to rank the 
three group members. Of the nine questions, six had 
been designated beforehand as referring to top man- 
agement types of behavior or traits, and three to 
middle management. Examples of top management 
questions were: “Who in your group was the most 
resourceful and enterprising?” and “Who in your 
group would make the best president of a large cor-
poration?” An example of middle management ques-
tions was: “Who in your group would make the best
middle management executive (e.g., production man-
ager, personnel manager, sales manager, etc.)?” As
in the rankings on physical behavior, each ranking 
that a person received was weighted and total rank 
scores for top and middle were obtained. Again, a 
ratio-type top/middle score was obtained for each 
person. 


Group production measure 


Although this study was concerned with the rela- 
tionship between individuals’ DMA scores and as- 
pects of their behavior, the nature of the group’s 
task made it possible to obtain a crude measure of 
group productivity that could be compared with 
other measures of group behavior. The scoring pro- 
cedure described above that was presented to the Ss 
in their instructions was used as the measure of 
group productivity. This measure, however, proved 
to be relatively unreliable (r = .40), and therefore
group production results will not be further reported 
in this study. 


Results 


Table 1 presents the correlations between 
individuals’ DMA scores and measures of 
their behavior in the task situation. In the
top half of the table, all correlations are with 
DMA raw scores. The bottom half of the 
table presents correlations where individuals’ 
DMA scores are standard scores, each indi-
vidual’s score being based on the mean and
standard deviation of DMA scores in his 
group of three individuals. 
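
The within-group standardization can be sketched in Python as follows. The use of the population standard deviation (`statistics.pstdev`), the treatment of a group with identical scores, the function name, and the example scores are all assumptions; the text states only that each standard score is based on the mean and standard deviation of the individual's own group.

```python
from statistics import mean, pstdev


def group_standard_scores(raw_scores):
    """Standardize each member's DMA raw score within his own group."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # assumption: population SD of the group
    if sd == 0:
        # assumption: identical scores carry no relative information
        return [0.0 for _ in raw_scores]
    return [(x - m) / sd for x in raw_scores]


# Two hypothetical groups with different raw levels but the same pattern.
print([round(z, 2) for z in group_standard_scores([14, 10, 6])])  # [1.22, 0.0, -1.22]
print([round(z, 2) for z in group_standard_scores([8, 6, 4])])    # [1.22, 0.0, -1.22]
```

In this illustration a raw score of 8 in the second group receives the same standard score as a raw score of 14 in the first, since each is expressed relative to the other two members of its own group.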

The correlations that test the primary hy- 
pothesis of this experiment are those between 
DMA scores and Verbal Interaction scores. 


Table 1 


Coefficients of Correlation of Individuals’ DMA Scores 
With Measures of Individual Behavior 
and With Peer Rankings 


(N = 60) 


Variable                                          r

Variables correlated with DMA raw scores
  Verbal interaction (raw scores)                .34
  Physical behavior                              n.s.
  Peer ranking                                   .25
Variables correlated with DMA standard scores
  Verbal interaction (standard scores)           .52
  Physical behavior                              n.s.
  Peer ranking                                   .36


These correlations between the top-middle 
management scale of the SDI and the ratio 
of top to middle management type of verbal 
interaction are .34 (p = .01) where raw
scores on both measures are used, and .52
(p = .001) where standard scores are used.
Not only do high DMA scorers behave ver-
bally in a more top management manner, but
they are also seen as doing so by their peers.
Peer rankings correlate .25 (p = .05) with
DMA raw scores and .36 (p = .01) with
DMA standard scores. The Physical Be- 
havior measure correlates insignificantly with 
both raw and standard DMA scores. (Only 
standard scores are used throughout Table 1 
for Peer Ranking and Physical Behavior since 
both measures involve rankings of each indi- 
vidual relative to the other two members in 
his group.) 


Discussion 

The results of this study confirm the hy- 
potheses stated in the introduction. They 
show that scores obtained on a scale devel- 
oped from the differential self-perceptions of 
top and middle management personnel are 
significantly related to the type of behavior 
that a person exhibits in an actual group 
situation where some task must be performed. 
Specifically, those individuals who score higher 
on the DMA scale behave, in terms of verbal
interaction with others, relatively more like
top management personnel describe them-
selves than do persons who score lower on
the DMA scale. Also, those individuals who 
score higher on the top-middle management 
self-description scale are seen by their peers 
in the group situation to behave and appear 
relatively more like top management indi- 
viduals than do those who score lower on the 
scale. The results would appear to indicate 
that the DMA scale has some validity for 
predicting an individual’s behavior in a situa- 
tion where top or middle management types 
of verbal action are relevant. 

One particular aspect of the correlations 
between individuals’ DMA scores and meas- 
ures of their behavior that are presented in 
Table 1 deserves further comment. It will be 
noted in this table that the correlations are 
consistently higher when DMA standard
scores rather than raw scores are used. This 
indicates that predictions made from the 
DMA regarding a person’s actual behavior in 
a situation will be more valid when something 
is known about the DMA scores of the other 
individuals in the group. This finding lends 
support to the more general notion that some-
thing must be known about the characteristics
of all the members of a group if knowledge
of a particular individual’s traits (whether
gained through self-description or otherwise)
is to be used most accurately to predict his
behavior in a social situation. An individual’s
possession of particular traits or character- 
istics is important in determining his behav- 
ior in a group situation, but also important is 
the distribution and strength of his charac- 
teristics in relation to the distribution and 
strength of characteristics possessed by others 
in the group. 


Summary 


Previous studies have described the devel-
opment of a top-middle management scale de-
rived from the differential answers of top and
middle management personnel to a self-de- 
scription inventory. The present study was 
designed to determine whether individuals 
whose answers on this self-description inven- 
tory give them a high score on the top-mid- 
dle management scale behave in a group task 
in accordance with their relatively “top man- 
agement” self-descriptions. 

Sixty male undergraduates serving in groups 
of three were the Ss in this investigation. 
Each group of three Ss first completed the 
self-description inventory individually and 
then participated in a group task involving 
building a structure out of 8 × 11-in. fiber-
board cards. The task required group co-
operation and considerable verbal interaction. 
The comments of Ss while performing the task 
were categorized by one of the Es into top
or middle management types of remarks. At 
the conclusion of the group task session, Ss 
ranked their peers on questions dealing with 
top and middle management types of behavior. 

The results showed that individuals’ scores 
on the top-middle management self-descrip- 
tion scale were correlated significantly with 
the type of verbal interaction exhibited in the
experimental session. Also, the self-descrip- 
tion scores correlated with the peer rankings 
made by the group members. 

Received February 16, 1959. 


References


Ghiselli, E. E. The forced-choice technique in self-
description. Personnel Psychol., 1954, 7, 201-208.

Ghiselli, E. E., & Lodahl, T. M. Patterns of
managerial traits and group effectiveness. J. ab-
norm. soc. Psychol., 1958, 57, 61-66. (a)

Ghiselli, E. E., & Lodahl, T. M. The evaluation
of foreman’s performance in relation to the inter-
nal characteristics of their work groups. Person-
nel Psychol., 1958, 11, 179-188. (b)

Porter, L. W., & Ghiselli, E. E. The self-percep-
tions of top and middle management personnel.
Personnel Psychol., 1957, 10, 397-406.







